✨Ego2Web: A Web Agent Benchmark Grounded in Egocentric Videos
📝 Summary:
Ego2Web introduces the first benchmark bridging egocentric video perception and web agent execution, enabling evaluation of AI agents that can perceive physical surroundings and perform online tasks t...
🔹 Publication Date: Published on Mar 23
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22529
• PDF: https://arxiv.org/pdf/2603.22529
• Project Page: https://ego2web.github.io/
• Github: https://ego2web.github.io/
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding
📝 Summary:
MinerU-Diffusion is a diffusion-based framework that replaces autoregressive decoding with parallel diffusion denoising for document OCR, improving robustness and decoding speed.
🔹 Publication Date: Published on Mar 23
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22458
• PDF: https://arxiv.org/pdf/2603.22458
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Sparse but Critical: A Token-Level Analysis of Distributional Shifts in RLVR Fine-Tuning of LLMs
📝 Summary:
Reinforcement learning with verifiable rewards induces sparse, targeted changes in token distributions that can be systematically analyzed through distributional shifts and cross-sampling intervention...
🔹 Publication Date: Published on Mar 23
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22446
• PDF: https://arxiv.org/pdf/2603.22446
• Project Page: https://qwen-pilot.notion.site/rlvr-theseus
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨RealMaster: Lifting Rendered Scenes into Photorealistic Video
📝 Summary:
RealMaster combines video diffusion models with 3D engine outputs to generate photorealistic videos that maintain geometric accuracy and scene consistency through paired training and IC-LoRA distillat...
🔹 Publication Date: Published on Mar 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23462
• PDF: https://arxiv.org/pdf/2603.23462
• Project Page: https://danacohen95.github.io/RealMaster/
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Uncertainty-guided Compositional Alignment with Part-to-Whole Semantic Representativeness in Hyperbolic Vision-Language Models
📝 Summary:
Hyperbolic vision-language models are enhanced through uncertainty-guided compositional alignment that improves hierarchical structure representation and multi-object scene understanding.
🔹 Publication Date: Published on Mar 23
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22042
• PDF: https://arxiv.org/pdf/2603.22042
• Project Page: https://jeeit17.github.io/UNCHA-project_page/
• Github: https://github.com/jeeit17/UNCHA
🔹 Models citing this paper:
• https://huggingface.co/hayeonkim/uncha
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨SIMART: Decomposing Monolithic Meshes into Sim-ready Articulated Assets via MLLM
📝 Summary:
SIMART is a unified MLLM that generates sim-ready articulated 3D assets by jointly decomposing parts and predicting kinematics. Its Sparse 3D VQ-VAE significantly reduces 3D token overhead, enabling high-fidelity multi-part assemblies for physics simulation.
🔹 Publication Date: Published on Mar 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23386
• PDF: https://arxiv.org/pdf/2603.23386
• Project Page: https://simart-mllm.github.io/
• Github: https://simart-mllm.github.io/
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Reasoning or Rhetoric? An Empirical Analysis of Moral Reasoning Explanations in Large Language Models
📝 Summary:
Large language models exhibit post-conventional moral reasoning patterns inconsistent with human developmental trajectories, showing systematic logical incoherence and rhetorical sophistication withou...
🔹 Publication Date: Published on Mar 23
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.21854
• PDF: https://arxiv.org/pdf/2603.21854
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨VTAM: Video-Tactile-Action Models for Complex Physical Interaction Beyond VLAs
📝 Summary:
Video-Action Models struggle in contact-rich tasks because vision alone lacks fine force details. The Video-Tactile-Action Model (VTAM) integrates tactile perception with visual streams via multimodal fusion. VTAM significantly improves contact-rich manipulation by correcting visual errors, enabling rob...
🔹 Publication Date: Published on Mar 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23481
• PDF: https://arxiv.org/pdf/2603.23481
• Project Page: https://plan-lab.github.io/projects/vtam
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Session Risk Memory (SRM): Temporal Authorization for Deterministic Pre-Execution Safety Gates
📝 Summary:
Session Risk Memory (SRM) enhances authorization by evaluating agent behavior over time, which addresses distributed attacks. It uses semantic centroids and risk accumulation to achieve perfect detection with zero false positives, eliminating issues with stateless systems.
🔹 Publication Date: Published on Mar 22
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22350
• PDF: https://arxiv.org/pdf/2603.22350
==================================
#Cybersecurity #TemporalAuthorization #DistributedSystems #BehavioralAnalytics #RiskDetection
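💡 Illustration: the summary above mentions semantic centroids and risk accumulation but gives no formulas, so the following is only a minimal sketch of the general idea, not the paper's method. The class name `SessionRiskMemory`, the cosine-distance drift score, the exponential decay, and the threshold values are all assumptions made for this example.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return 1.0 - dot / (norm_a * norm_b)


class SessionRiskMemory:
    """Hypothetical stateful gate (an assumption, not the paper's API).

    It keeps a centroid of benign baseline embeddings and accumulates each
    action's drift from that centroid with exponential decay.
    """

    def __init__(self, baseline, threshold=0.9, decay=0.9):
        dims = len(baseline[0])
        # Centroid = per-dimension mean of the baseline embeddings.
        self.centroid = [sum(v[i] for v in baseline) / len(baseline)
                         for i in range(dims)]
        self.threshold = threshold
        self.decay = decay
        self.risk = 0.0

    def check(self, embedding):
        """Return True to allow the action, False to block it."""
        drift = cosine_distance(embedding, self.centroid)
        # Older drift fades, recent drift adds up across the session.
        self.risk = self.decay * self.risk + drift
        return self.risk <= self.threshold


gate = SessionRiskMemory(baseline=[[1.0, 0.0]])
decisions = [gate.check([1.0, 1.0]) for _ in range(4)]
print(decisions)
```

Each individual action here drifts by less than the threshold, so a stateless per-action check would pass all of them; the accumulated session risk is what eventually triggers a block.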
✨2Xplat: Two Experts Are Better Than One Generalist
📝 Summary:
2Xplat proposes a two-expert architecture for pose-free 3D Gaussian Splatting. It explicitly separates geometry estimation from appearance synthesis, outperforming unified methods and matching state-of-the-art performance with less training.
🔹 Publication Date: Published on Mar 22
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.21064
• PDF: https://arxiv.org/pdf/2603.21064
• Project Page: https://hwasikjeong.github.io/2Xplat
• Github: https://github.com/HwasikJeong/2Xplat
==================================
#GaussianSplatting #3DReconstruction #ComputerVision #AI #DeepLearning
✨DA-Flow: Degradation-Aware Optical Flow Estimation with Diffusion Models
📝 Summary:
Traditional optical flow models fail on corrupted real-world videos. This paper introduces DA-Flow, a new method that leverages corruption-aware features from spatio-temporally enhanced diffusion models. Fusing these with convolutional features, DA-Flow significantly improves performance on degra...
🔹 Publication Date: Published on Mar 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23499
• PDF: https://arxiv.org/pdf/2603.23499
• Project Page: https://cvlab-kaist.github.io/DA-Flow/
• Github: https://github.com/cvlab-kaist/DA-Flow
==================================
#OpticalFlow #DiffusionModels #ComputerVision #DeepLearning #AI
✨VP-VLA: Visual Prompting as an Interface for Vision-Language-Action Models
📝 Summary:
VP-VLA is a dual-system framework that separates high-level task planning from low-level robotic control. It uses visual prompts like bounding boxes to guide the controller, improving spatial precision and robustness in vision-language-action tasks. This approach outperforms existing VLA models.
🔹 Publication Date: Published on Mar 23
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22003
• PDF: https://arxiv.org/pdf/2603.22003
==================================
#VisionLanguageAction #Robotics #VisualPrompting #AIResearch #MachineLearning
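💡 Illustration: the summary above says bounding boxes serve as visual prompts for the controller. As a rough sketch of what such a prompt could look like, the hypothetical helper below draws a box outline onto a grayscale image stored as a list of rows. The function name, the image representation, and the inclusive `(x0, y0, x1, y1)` box format are assumptions; the summary does not describe how VP-VLA actually renders or consumes its prompts.

```python
def draw_box_prompt(image, box, value=255):
    """Return a copy of `image` with a rectangle outline drawn on it.

    image: 2D grayscale image as a list of rows of ints.
    box:   (x0, y0, x1, y1) in inclusive pixel coordinates (assumed format).
    """
    x0, y0, x1, y1 = box
    height, width = len(image), len(image[0])
    if not (0 <= x0 <= x1 < width and 0 <= y0 <= y1 < height):
        raise ValueError("box must lie inside the image")

    prompted = [row[:] for row in image]  # leave the input untouched
    for x in range(x0, x1 + 1):           # top and bottom edges
        prompted[y0][x] = value
        prompted[y1][x] = value
    for y in range(y0, y1 + 1):           # left and right edges
        prompted[y][x0] = value
        prompted[y][x1] = value
    return prompted


blank = [[0] * 5 for _ in range(5)]
for row in draw_box_prompt(blank, (1, 1, 3, 3)):
    print(row)
```

In a dual-system setup of the kind the summary describes, a planner would choose the box and the low-level controller would receive the prompted image instead of the raw one.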
✨Regulating AI Agents
📝 Summary:
The EU AI Act struggles to regulate autonomous AI agents due to gaps in its framework. This paper analyzes the Act's provisions and institutional setups, finding them ill-suited for these new systems. Policymakers must adapt to effectively govern next-generation AI technology.
🔹 Publication Date: Published on Mar 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23471
• PDF: https://arxiv.org/pdf/2603.23471
==================================
#AILaw #AIGovernance #EUAIACT #AutonomousAI #TechPolicy
✨VISion On Request: Enhanced VLLM efficiency with sparse, dynamically selected, vision-language interactions
📝 Summary:
VISOR improves LVLM efficiency by sparsifying image-text interactions using strategically placed, dynamic attention layers. This allows high-resolution reasoning on demand, significantly reducing computational cost while matching state-of-the-art performance on complex visual tasks.
🔹 Publication Date: Published on Mar 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23495
• PDF: https://arxiv.org/pdf/2603.23495
==================================
#VLLM #VisionLanguageAI #AIEfficiency #DeepLearning #AIResearch
✨InfiniDepth: Arbitrary-Resolution and Fine-Grained Depth Estimation with Neural Implicit Fields
📝 Summary:
InfiniDepth introduces neural implicit fields for continuous 2D depth querying, overcoming limitations of discrete grid methods. This enables arbitrary-resolution and fine-grained depth estimation, achieving state-of-the-art performance, particularly in fine-detail regions and for novel view synt...
🔹 Publication Date: Published on Jan 6
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03252
• PDF: https://arxiv.org/pdf/2601.03252
• Project Page: https://zju3dv.github.io/InfiniDepth
• Github: https://zju3dv.github.io/InfiniDepth
==================================
#DepthEstimation #NeuralImplicitFields #ComputerVision #AI #3DGraphics
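💡 Illustration: to show what "continuous 2D depth querying" means compared with a fixed pixel grid, here is a toy, non-learned sketch. Features are sampled bilinearly at any real-valued coordinate and passed to a decoder, so the same field can be rendered at any output resolution. The helper names, the normalized `(u, v)` coordinates, and the trivial decoder (a stand-in for a learned MLP) are assumptions for this example and are not taken from InfiniDepth.

```python
import math


def bilinear_sample(grid, x, y):
    """Sample a 2D grid (list of rows) at continuous pixel coords (x, y)."""
    height, width = len(grid), len(grid[0])
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    tx, ty = x - x0, y - y0
    top = grid[y0][x0] * (1 - tx) + grid[y0][x1] * tx
    bottom = grid[y1][x0] * (1 - tx) + grid[y1][x1] * tx
    return top * (1 - ty) + bottom * ty


def make_depth_field(feature_grid, decoder):
    """Build depth_at(u, v) for normalized coordinates u, v in [0, 1]."""
    height, width = len(feature_grid), len(feature_grid[0])

    def depth_at(u, v):
        feature = bilinear_sample(feature_grid, u * (width - 1), v * (height - 1))
        return decoder(u, v, feature)

    return depth_at


def render(depth_at, width, height):
    """Query the field at pixel centers of an arbitrary output resolution."""
    return [[depth_at((i + 0.5) / width, (j + 0.5) / height)
             for i in range(width)]
            for j in range(height)]


# Placeholder decoder: a real model would use a learned network here.
field = make_depth_field([[0.0, 1.0], [2.0, 3.0]], lambda u, v, f: f)
low_res = render(field, 2, 2)
high_res = render(field, 8, 6)
print(len(high_res), len(high_res[0]))
```

The point of the sketch is that resolution is chosen at query time: the 2x2 feature grid never changes, only the set of coordinates being asked about.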
✨STEM Agent: A Self-Adapting, Tool-Enabled, Extensible Architecture for Multi-Protocol AI Agent Systems
📝 Summary:
STEM Agent is a self-adapting, modular AI architecture. Inspired by biology, it dynamically differentiates components for diverse interaction protocols, tool integration, and user modeling, solving fixed framework limitations.
🔹 Publication Date: Published on Mar 22
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22359
• PDF: https://arxiv.org/pdf/2603.22359
==================================
#AIAgents #AIArchitecture #AdaptiveAI #ToolIntegration #AIResearch
✨Reconstruction-Guided Slot Curriculum: Addressing Object Over-Fragmentation in Video Object-Centric Learning
📝 Summary:
SlotCurri addresses video object over-fragmentation using a reconstruction-guided slot curriculum. It progressively allocates slots, employs a structure-aware loss for sharp boundaries, and uses cyclic inference for temporal consistency. This method significantly improves object decomposition.
🔹 Publication Date: Published on Mar 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22758
• PDF: https://arxiv.org/pdf/2603.22758
• Github: https://github.com/wjun0830/SlotCurri
🔹 Models citing this paper:
• https://huggingface.co/WJ0830/SlotCurri
==================================
#VideoAI #ObjectCentricLearning #ComputerVision #DeepLearning #ObjectSegmentation
✨Logics-Parsing Technical Report
📝 Summary:
Logics-Parsing is an end-to-end LVLM enhanced with reinforcement learning to improve document parsing. It optimizes layout analysis and reading order inference, achieving state-of-the-art performance on diverse document types across a new benchmark.
🔹 Publication Date: Published on Sep 24, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.19760
• PDF: https://arxiv.org/pdf/2509.19760
• Github: https://github.com/alibaba/Logics-Parsing
🔹 Models citing this paper:
• https://huggingface.co/Logics-MLLM/Logics-Parsing
• https://huggingface.co/Mungert/Logics-Parsing-GGUF
🔹 Spaces citing this paper:
• https://huggingface.co/spaces/prithivMLmods/VLM-Parsing
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨One View Is Enough! Monocular Training for In-the-Wild Novel View Generation
📝 Summary:
OVIE enables monocular novel-view synthesis from single images by generating pseudo-target views via a geometric scaffold. This eliminates the need for multi-view supervision, allowing training on massive unpaired datasets. OVIE achieves superior zero-shot performance and is significantly faster ...
🔹 Publication Date: Published on Mar 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23488
• PDF: https://arxiv.org/pdf/2603.23488
• Github: https://github.com/AdrienRR/ovie
==================================
#NovelViewSynthesis #MonocularVision #ComputerVision #DeepLearning #3DVision
✨Fair splits flip the leaderboard: CHANRG reveals limited generalization in RNA secondary-structure prediction
📝 Summary:
The CHANRG benchmark reveals RNA foundation models achieve high held-out accuracy but lose significant robustness out-of-distribution. This new benchmark provides a stricter framework for evaluating RNA secondary structure prediction.
🔹 Publication Date: Published on Mar 20
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22330
• PDF: https://arxiv.org/pdf/2603.22330
• Project Page: https://huggingface.co/datasets/multimolecule/chanrg
• Github: https://github.com/MultiMolecule/multimolecule
==================================
#RNAstructure #MachineLearning #FoundationModels #Bioinformatics #ModelRobustness
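💡 Illustration: the gap between held-out and out-of-distribution accuracy noted above usually depends on how the data is split. The sketch below shows one common way to make a split stricter: whole groups are held out so that no group appears on both sides. The `family` field, the record layout, and the function name are assumptions for this example; the summary does not describe CHANRG's actual splitting procedure.

```python
def split_by_group(records, test_groups, key="family"):
    """Split records so every group lands entirely in train or in test."""
    train, test = [], []
    for record in records:
        if record[key] in test_groups:
            test.append(record)
        else:
            train.append(record)
    return train, test


# Hypothetical toy records; sequences and family labels are made up.
records = [
    {"id": "a1", "family": "famA", "seq": "GGGAAACCC"},
    {"id": "a2", "family": "famA", "seq": "GGGAAUCCC"},
    {"id": "b1", "family": "famB", "seq": "GCGCUUCGGCGC"},
    {"id": "c1", "family": "famC", "seq": "AUGGCUAGCCAU"},
]
train, test = split_by_group(records, test_groups={"famB"})
shared = {r["family"] for r in train} & {r["family"] for r in test}
print(len(train), len(test), shared)
```

With a random per-record split, near-identical members of the same group could end up in both sets and inflate the held-out score; a group-level split removes that overlap by construction.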
✨CanViT: Toward Active-Vision Foundation Models
📝 Summary:
CanViT represents the first task- and policy-agnostic Active-Vision Foundation Model that efficiently processes visual scenes through sequential glimpses using a retinotopic Vision Transformer backbon...
🔹 Publication Date: Published on Mar 23
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22570
• PDF: https://arxiv.org/pdf/2603.22570
• Github: https://github.com/m2b3/CanViT-PyTorch
🔹 Models citing this paper:
• https://huggingface.co/canvit/canvitb16-add-vpe-pretrain-g128px-s512px-in21k-dv3b16-2026-02-02
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research