ML Research Hub – Telegram

ML Research Hub

32.4K subscribers

6.15K photos

404 videos

24 files

6.66K links

Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho

ML Research Hub

✨Ego2Web: A Web Agent Benchmark Grounded in Egocentric Videos

📝 Summary:
Ego2Web introduces the first benchmark bridging egocentric video perception and web agent execution, enabling evaluation of AI agents that can perceive physical surroundings and perform online tasks t...

🔹 Publication Date: Published on Mar 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22529
• PDF: https://arxiv.org/pdf/2603.22529
• Project Page: https://ego2web.github.io/
• Github: https://ego2web.github.io/

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research

98 views · 03:02

ML Research Hub

✨MinerU-Diffusion: Rethinking Document OCR as Inverse Rendering via Diffusion Decoding

📝 Summary:
MinerU-Diffusion is a diffusion-based framework that replaces autoregressive decoding with parallel diffusion denoising for document OCR, improving robustness and decoding speed.

🔹 Publication Date: Published on Mar 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22458
• PDF: https://arxiv.org/pdf/2603.22458

#AI #DataScience #MachineLearning #HuggingFace #Research

87 views · 03:03

ML Research Hub

✨Sparse but Critical: A Token-Level Analysis of Distributional Shifts in RLVR Fine-Tuning of LLMs

📝 Summary:
Reinforcement learning with verifiable rewards induces sparse, targeted changes in token distributions that can be systematically analyzed through distributional shifts and cross-sampling intervention...

🔹 Publication Date: Published on Mar 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22446
• PDF: https://arxiv.org/pdf/2603.22446
• Project Page: https://qwen-pilot.notion.site/rlvr-theseus

#AI #DataScience #MachineLearning #HuggingFace #Research

122 views · 03:03

ML Research Hub

✨RealMaster: Lifting Rendered Scenes into Photorealistic Video

📝 Summary:
RealMaster combines video diffusion models with 3D engine outputs to generate photorealistic videos that maintain geometric accuracy and scene consistency through paired training and IC-LoRA distillat...

🔹 Publication Date: Published on Mar 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23462
• PDF: https://arxiv.org/pdf/2603.23462
• Project Page: https://danacohen95.github.io/RealMaster/

#AI #DataScience #MachineLearning #HuggingFace #Research

148 views · 03:03

ML Research Hub

✨Uncertainty-guided Compositional Alignment with Part-to-Whole Semantic Representativeness in Hyperbolic Vision-Language Models

📝 Summary:
Hyperbolic vision-language models are enhanced through uncertainty-guided compositional alignment that improves hierarchical structure representation and multi-object scene understanding.

🔹 Publication Date: Published on Mar 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22042
• PDF: https://arxiv.org/pdf/2603.22042
• Project Page: https://jeeit17.github.io/UNCHA-project_page/
• Github: https://github.com/jeeit17/UNCHA

🔹 Models citing this paper:
• https://huggingface.co/hayeonkim/uncha

#AI #DataScience #MachineLearning #HuggingFace #Research

149 views · 04:04

ML Research Hub

✨SIMART: Decomposing Monolithic Meshes into Sim-ready Articulated Assets via MLLM

📝 Summary:
SIMART is a unified MLLM that generates sim-ready articulated 3D assets by jointly decomposing parts and predicting kinematics. Its Sparse 3D VQ-VAE significantly reduces 3D token overhead, enabling high-fidelity multi-part assemblies for physics simulation.

🔹 Publication Date: Published on Mar 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23386
• PDF: https://arxiv.org/pdf/2603.23386
• Project Page: https://simart-mllm.github.io/
• Github: https://simart-mllm.github.io/

#AI #DataScience #MachineLearning #HuggingFace #Research

170 views · 04:04

ML Research Hub

✨Reasoning or Rhetoric? An Empirical Analysis of Moral Reasoning Explanations in Large Language Models

📝 Summary:
Large language models exhibit post-conventional moral reasoning patterns inconsistent with human developmental trajectories, showing systematic logical incoherence and rhetorical sophistication withou...

🔹 Publication Date: Published on Mar 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.21854
• PDF: https://arxiv.org/pdf/2603.21854

#AI #DataScience #MachineLearning #HuggingFace #Research

❤1

188 views · 05:04

ML Research Hub

✨VTAM: Video-Tactile-Action Models for Complex Physical Interaction Beyond VLAs

📝 Summary:
Video-Action Models struggle in contact-rich tasks because vision alone lacks fine force details. The Video-Tactile-Action Model (VTAM) integrates tactile perception with visual streams via multimodal fusion. VTAM significantly improves contact-rich manipulation by correcting visual errors, enabling rob...

🔹 Publication Date: Published on Mar 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23481
• PDF: https://arxiv.org/pdf/2603.23481
• Project Page: https://plan-lab.github.io/projects/vtam

#AI #DataScience #MachineLearning #HuggingFace #Research

165 views · 06:04

ML Research Hub

✨Session Risk Memory (SRM): Temporal Authorization for Deterministic Pre-Execution Safety Gates

📝 Summary:
Session Risk Memory (SRM) strengthens authorization by evaluating agent behavior over time, which addresses attacks distributed across a session. It uses semantic centroids and risk accumulation, and the authors report perfect detection with zero false positives, avoiding the blind spots of stateless systems.
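
The idea of a stateful gate can be illustrated with a minimal Python sketch. This is not the paper's implementation: the class name `SessionRiskGate`, the cosine-distance drift measure, the running-mean centroid, and the `threshold` and `decay` parameters are all assumptions chosen for illustration.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; treats a zero vector as maximally distant."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


class SessionRiskGate:
    """Hypothetical stateful gate: accumulates risk across a session
    instead of judging each action in isolation."""

    def __init__(self, threshold=1.0, decay=0.9):
        self.threshold = threshold  # assumed cutoff for blocking
        self.decay = decay          # assumed forgetting factor for old risk
        self.centroid = None        # running mean of action embeddings
        self.count = 0
        self.risk = 0.0

    def check(self, embedding):
        """Return True to allow the action, False to block it."""
        # Drift of this action from the session's semantic centroid so far.
        drift = 0.0 if self.centroid is None else cosine_distance(embedding, self.centroid)
        # Accumulate risk over time, discounting older contributions.
        self.risk = self.decay * self.risk + drift
        # Update the running-mean centroid with the new embedding.
        self.count += 1
        if self.centroid is None:
            self.centroid = list(embedding)
        else:
            self.centroid = [c + (e - c) / self.count
                             for c, e in zip(self.centroid, embedding)]
        return self.risk < self.threshold
```

A stateless check would score each action on its own; here the decision depends on the accumulated `risk`, so a sequence of actions that drifts away from the session's earlier behavior can be blocked even if no single step looks unusual.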

🔹 Publication Date: Published on Mar 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22350
• PDF: https://arxiv.org/pdf/2603.22350

#Cybersecurity #TemporalAuthorization #DistributedSystems #BehavioralAnalytics #RiskDetection

143 views · 07:05

ML Research Hub

✨2Xplat: Two Experts Are Better Than One Generalist

📝 Summary:
2Xplat proposes a two-expert architecture for pose-free 3D Gaussian Splatting. It explicitly separates geometry estimation from appearance synthesis, outperforming unified methods and matching state-of-the-art performance with less training.

🔹 Publication Date: Published on Mar 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.21064
• PDF: https://arxiv.org/pdf/2603.21064
• Project Page: https://hwasikjeong.github.io/2Xplat
• Github: https://github.com/HwasikJeong/2Xplat

#GaussianSplatting #3DReconstruction #ComputerVision #AI #DeepLearning

157 views · 07:05

ML Research Hub

✨DA-Flow: Degradation-Aware Optical Flow Estimation with Diffusion Models

📝 Summary:
Traditional optical flow models fail on corrupted real-world videos. This paper introduces DA-Flow, a new method that leverages corruption-aware features from spatio-temporally enhanced diffusion models. Fusing these with convolutional features, DA-Flow significantly improves performance on degra...

🔹 Publication Date: Published on Mar 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23499
• PDF: https://arxiv.org/pdf/2603.23499
• Project Page: https://cvlab-kaist.github.io/DA-Flow/
• Github: https://github.com/cvlab-kaist/DA-Flow

#OpticalFlow #DiffusionModels #ComputerVision #DeepLearning #AI

162 views · 08:05

ML Research Hub

✨VP-VLA: Visual Prompting as an Interface for Vision-Language-Action Models

📝 Summary:
VP-VLA is a dual-system framework that separates high-level task planning from low-level robotic control. It uses visual prompts like bounding boxes to guide the controller, improving spatial precision and robustness in vision-language-action tasks. This approach outperforms existing VLA models.

🔹 Publication Date: Published on Mar 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22003
• PDF: https://arxiv.org/pdf/2603.22003

#VisionLanguageAction #Robotics #VisualPrompting #AIResearch #MachineLearning

175 views · 08:05

ML Research Hub

✨Regulating AI Agents

📝 Summary:
The EU AI Act struggles to regulate autonomous AI agents due to gaps in its framework. This paper analyzes the Act's provisions and institutional setups and finds them ill-suited to these new systems. Policymakers must adapt to govern next-generation AI technology effectively.

🔹 Publication Date: Published on Mar 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23471
• PDF: https://arxiv.org/pdf/2603.23471

#AILaw #AIGovernance #EUAIACT #AutonomousAI #TechPolicy

184 views · 09:06

ML Research Hub

✨VISion On Request: Enhanced VLLM efficiency with sparse, dynamically selected, vision-language interactions

📝 Summary:
VISOR improves LVLM efficiency by sparsifying image-text interactions using strategically placed, dynamic attention layers. This allows high-resolution reasoning on demand, significantly reducing computational cost while matching state-of-the-art performance on complex visual tasks.

🔹 Publication Date: Published on Mar 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23495
• PDF: https://arxiv.org/pdf/2603.23495

#VLLM #VisionLanguageAI #AIEfficiency #DeepLearning #AIResearch

219 views · 09:06

ML Research Hub

✨InfiniDepth: Arbitrary-Resolution and Fine-Grained Depth Estimation with Neural Implicit Fields

📝 Summary:
InfiniDepth introduces neural implicit fields for continuous 2D depth querying, overcoming limitations of discrete grid methods. This enables arbitrary-resolution and fine-grained depth estimation, achieving state-of-the-art performance, particularly in fine-detail regions and for novel view synt...

🔹 Publication Date: Published on Jan 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.03252
• PDF: https://arxiv.org/pdf/2601.03252
• Project Page: https://zju3dv.github.io/InfiniDepth
• Github: https://zju3dv.github.io/InfiniDepth

#DepthEstimation #NeuralImplicitFields #ComputerVision #AI #3DGraphics

❤1

219 views · 10:06

ML Research Hub

✨STEM Agent: A Self-Adapting, Tool-Enabled, Extensible Architecture for Multi-Protocol AI Agent Systems

📝 Summary:
STEM Agent is a self-adapting, modular AI architecture. Inspired by biology, it dynamically differentiates components for diverse interaction protocols, tool integration, and user modeling, solving fixed framework limitations.

🔹 Publication Date: Published on Mar 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22359
• PDF: https://arxiv.org/pdf/2603.22359

#AIAgents #AIArchitecture #AdaptiveAI #ToolIntegration #AIResearch

197 views · 11:07

ML Research Hub

✨Reconstruction-Guided Slot Curriculum: Addressing Object Over-Fragmentation in Video Object-Centric Learning

📝 Summary:
SlotCurri addresses video object over-fragmentation using a reconstruction-guided slot curriculum. It progressively allocates slots, employs a structure-aware loss for sharp boundaries, and uses cyclic inference for temporal consistency. This method significantly improves object decomposition.

🔹 Publication Date: Published on Mar 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22758
• PDF: https://arxiv.org/pdf/2603.22758
• Github: https://github.com/wjun0830/SlotCurri

🔹 Models citing this paper:
• https://huggingface.co/WJ0830/SlotCurri

#VideoAI #ObjectCentricLearning #ComputerVision #DeepLearning #ObjectSegmentation

201 views · 12:07

ML Research Hub

✨Logics-Parsing Technical Report

📝 Summary:
Logics-Parsing is an end-to-end LVLM enhanced with reinforcement learning to improve document parsing. It optimizes layout analysis and reading order inference, achieving state-of-the-art performance on diverse document types in a new benchmark.

🔹 Publication Date: Published on Sep 24, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.19760
• PDF: https://arxiv.org/pdf/2509.19760
• Github: https://github.com/alibaba/Logics-Parsing

🔹 Models citing this paper:
• https://huggingface.co/Logics-MLLM/Logics-Parsing
• https://huggingface.co/Mungert/Logics-Parsing-GGUF

✨ Spaces citing this paper:
• https://huggingface.co/spaces/prithivMLmods/VLM-Parsing

#AI #DataScience #MachineLearning #HuggingFace #Research

253 views · 12:07

ML Research Hub

✨One View Is Enough! Monocular Training for In-the-Wild Novel View Generation

📝 Summary:
OVIE enables monocular novel-view synthesis from single images by generating pseudo-target views via a geometric scaffold. This eliminates the need for multi-view supervision, allowing training on massive unpaired datasets. OVIE achieves superior zero-shot performance and is significantly faster ...

🔹 Publication Date: Published on Mar 24

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.23488
• PDF: https://arxiv.org/pdf/2603.23488
• Github: https://github.com/AdrienRR/ovie

#NovelViewSynthesis #MonocularVision #ComputerVision #DeepLearning #3DVision

❤1

231 views · 14:07

ML Research Hub

✨Fair splits flip the leaderboard: CHANRG reveals limited generalization in RNA secondary-structure prediction

📝 Summary:
The CHANRG benchmark reveals RNA foundation models achieve high held-out accuracy but lose significant robustness out-of-distribution. This new benchmark provides a stricter framework for evaluating RNA secondary structure prediction.

🔹 Publication Date: Published on Mar 20

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22330
• PDF: https://arxiv.org/pdf/2603.22330
• Project Page: https://huggingface.co/datasets/multimolecule/chanrg
• Github: https://github.com/MultiMolecule/multimolecule

#RNAstructure #MachineLearning #FoundationModels #Bioinformatics #ModelRobustness

❤1

188 views · 16:08

ML Research Hub

✨CanViT: Toward Active-Vision Foundation Models

📝 Summary:
CanViT represents the first task- and policy-agnostic Active-Vision Foundation Model that efficiently processes visual scenes through sequential glimpses using a retinotopic Vision Transformer backbon...

🔹 Publication Date: Published on Mar 23

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.22570
• PDF: https://arxiv.org/pdf/2603.22570
• Github: https://github.com/m2b3/CanViT-PyTorch

🔹 Models citing this paper:
• https://huggingface.co/canvit/canvitb16-add-vpe-pretrain-g128px-s512px-in21k-dv3b16-2026-02-02

#AI #DataScience #MachineLearning #HuggingFace #Research

❤1

212 views · 16:08
