ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Agent-Omit: Training Efficient LLM Agents for Adaptive Thought and Observation Omission via Agentic Reinforcement Learning

📝 Summary:
Agent-Omit is a training framework that enables LLM agents to adaptively omit redundant thoughts and observations during multi-turn interactions, achieving superior effectiveness-efficiency trade-offs...

🔹 Publication Date: Published on Feb 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.04284
• PDF: https://arxiv.org/pdf/2602.04284
• Project Page: https://github.com/usail-hkust/Agent-Omit
• Github: https://github.com/usail-hkust/Agent-Omit

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Horizon-LM: A RAM-Centric Architecture for LLM Training

📝 Summary:
Horizon-LM enables large-model training on single GPUs by redefining CPU-GPU roles and eliminating persistent GPU memory usage through explicit recomputation and pipelined execution.

🔹 Publication Date: Published on Feb 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.04816
• PDF: https://arxiv.org/pdf/2602.04816
• Github: https://github.com/DLYuanGod/Horizon-LM

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning

📝 Summary:
Multi-agent systems using reinforcement learning enable parallel information seeking with scalable orchestration, achieving performance comparable to larger single agents.

🔹 Publication Date: Published on Feb 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.04634
• PDF: https://arxiv.org/pdf/2602.04634
• Project Page: https://wideseek-r1.github.io/

🔹 Models citing this paper:
https://huggingface.co/RLinf/WideSeek-R1-4b

🔹 Datasets citing this paper:
https://huggingface.co/datasets/RLinf/WideSeek-R1-train-data
https://huggingface.co/datasets/RLinf/WideSeek-R1-Corpus

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
HySparse: A Hybrid Sparse Attention Architecture with Oracle Token Selection and KV Cache Sharing

📝 Summary:
Hybrid Sparse Attention architecture interleaves full and sparse attention layers, using full attention output to guide sparse layer token selection and cache reuse for improved efficiency and perform...

🔹 Publication Date: Published on Feb 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03560
• PDF: https://arxiv.org/pdf/2602.03560
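
The idea in the summary above (a full attention layer guiding which tokens the following sparse layers attend to, with the key/value cache reused) can be illustrated with a toy single-query sketch. This is not the paper's implementation: the top-k selection rule, the function names `attend` and `select_top_k`, and the toy vectors are all assumptions made for illustration.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query, keys, values, indices=None):
    """Single-query scaled dot-product attention over the given cache indices."""
    if indices is None:
        indices = list(range(len(keys)))  # full attention: every cached token
    scale = 1.0 / math.sqrt(len(query))
    weights = softmax([dot(query, keys[i]) * scale for i in indices])
    dim = len(values[0])
    out = [sum(w * values[i][d] for w, i in zip(weights, indices)) for d in range(dim)]
    return out, weights

def select_top_k(weights, k):
    """Hypothetical selection rule: keep the k tokens with the largest full-attention weight."""
    ranked = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return sorted(ranked[:k])

# Toy KV cache, shared by the full layer and the sparse layer (assumed values).
keys = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
values = [[1.0], [2.0], [3.0], [4.0]]

# Full attention layer: attends to every token and exposes its weights.
full_query = [2.0, 0.0]
full_out, full_weights = attend(full_query, keys, values)

# Sparse layer: reuses the same cache, but only at the selected positions.
selected = select_top_k(full_weights, k=2)
sparse_query = [1.0, 0.2]
sparse_out, sparse_weights = attend(sparse_query, keys, values, indices=selected)

print("selected token indices:", selected)
print("sparse output:", sparse_out)
```

In this sketch the sparse layer never scores the unselected tokens, which is where the compute saving would come from; how the actual architecture selects tokens and shares caches across layers is described in the paper.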

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Skin Tokens: A Learned Compact Representation for Unified Autoregressive Rigging

📝 Summary:
Generative 3D models face challenges in animation rigging, which this work addresses by introducing SkinTokens—a learned discrete representation for skinning weights—and TokenRig, a unified autoregres...

🔹 Publication Date: Published on Feb 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.04805
• PDF: https://arxiv.org/pdf/2602.04805
• Project Page: https://zjp-shadow.github.io/works/SkinTokens/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
HY3D-Bench: Generation of 3D Assets

📝 Summary:
HY3D-Bench presents an open-source ecosystem for 3D content creation that provides high-fidelity 3D objects and synthetic assets to advance 3D generation capabilities.

🔹 Publication Date: Published on Feb 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03907
• PDF: https://arxiv.org/pdf/2602.03907
• Project Page: https://3d.hunyuan.tencent.com/login?redirect_url=https%3A%2F%2F3d.hunyuan.tencent.com%2F

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
TIDE: Trajectory-based Diagnostic Evaluation of Test-Time Improvement in LLM Agents

📝 Summary:
Test-Time Improvement (TTI) in autonomous LLM agents involves iterative environmental interaction that enhances performance, but current evaluation methods inadequately capture task optimization effic...

🔹 Publication Date: Published on Feb 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02196
• PDF: https://arxiv.org/pdf/2602.02196

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Protein Autoregressive Modeling via Multiscale Structure Generation

📝 Summary:
PAR is a multi-scale autoregressive framework for protein backbone generation that uses hierarchical structure modeling, autoregressive transformers, and flow-based decoding to produce high-quality pr...

🔹 Publication Date: Published on Feb 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.04883
• PDF: https://arxiv.org/pdf/2602.04883

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
ACE-Step: A Step Towards Music Generation Foundation Model

📝 Summary:
ACE-Step is an open-source music generation model that integrates diffusion generation with a lightweight transformer and deep compression autoencoder, achieving fast inference, high coherence, and fi...

🔹 Publication Date: Published on May 28, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2506.00045
• PDF: https://arxiv.org/pdf/2506.00045
• Github: https://github.com/ace-step/ACE-Step

🔹 Spaces citing this paper:
https://huggingface.co/spaces/DengLi1208/ACE-Step-1.5

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
ERNIE 5.0 Technical Report

📝 Summary:
ERNIE 5.0 is a production-scale trillion-parameter autoregressive model that unifies multimodal understanding and generation through sparse MoE architecture and elastic training.

🔹 Publication Date: Published on Feb 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.04705
• PDF: https://arxiv.org/pdf/2602.04705

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
From Data to Behavior: Predicting Unintended Model Behaviors Before Training

📝 Summary:
Data2Behavior predicts unintended model behaviors before training using MDF, a lightweight method that analyzes data features to reveal potential biases without parameter updates.

🔹 Publication Date: Published on Feb 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.04735
• PDF: https://arxiv.org/pdf/2602.04735
• Github: https://github.com/zjunlp/Data2Behavior

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
OmniSIFT: Modality-Asymmetric Token Compression for Efficient Omni-modal Large Language Models

📝 Summary:
OmniSIFT is a modality-asymmetric token compression framework for Omni-LLMs that reduces computational overhead through spatio-temporal video pruning and vision-guided audio selection while maintainin...

🔹 Publication Date: Published on Feb 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.04804
• PDF: https://arxiv.org/pdf/2602.04804

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
VLS: Steering Pretrained Robot Policies via Vision-Language Models

📝 Summary:
Pretrained diffusion and flow-matching policies fail under test-time shifts due to tight coupling with training configurations, prompting the development of Vision-Language Steering (VLS) for training...

🔹 Publication Date: Published on Feb 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03973
• PDF: https://arxiv.org/pdf/2602.03973
• Project Page: https://vision-language-steering.github.io/webpage/
• Github: https://github.com/Vision-Language-Steering/code

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Quantifying the Gap between Understanding and Generation within Unified Multimodal Models

📝 Summary:
Unified multimodal models exhibit a persistent gap between understanding and generation capabilities, indicating only surface-level integration rather than deep cognitive convergence.

🔹 Publication Date: Published on Feb 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02140
• PDF: https://arxiv.org/pdf/2602.02140

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Likelihood-Based Reward Designs for General LLM Reasoning

📝 Summary:
Log-probability rewards derived from the reference answer's likelihood outperform binary rewards in chain-of-thought fine-tuning across both verifiable and non-verifiable reasoning benchmarks.

🔹 Publication Date: Published on Feb 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03979
• PDF: https://arxiv.org/pdf/2602.03979
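
To make the contrast in the summary above concrete, here is a minimal sketch of a log-probability reward next to a binary reward. It assumes the reward is the summed (or optionally length-normalized) log-probability the model assigns to the reference answer's tokens after a chain of thought; the exact variant used in the paper may differ. The function names and the per-token probabilities are hypothetical.

```python
import math

def log_prob_reward(token_probs, length_normalize=False):
    """Log-likelihood of the reference answer, given per-token probabilities.

    token_probs: probabilities the model assigns to each reference-answer token
    (assumed to be computed elsewhere, conditioned on the sampled chain of thought).
    """
    if not token_probs:
        raise ValueError("reference answer must contain at least one token")
    total = sum(math.log(p) for p in token_probs)
    return total / len(token_probs) if length_normalize else total

def binary_reward(predicted, reference):
    """1.0 on an exact (whitespace-trimmed) match, else 0.0."""
    return 1.0 if predicted.strip() == reference.strip() else 0.0

# Hypothetical probabilities of the same reference answer after two different chains of thought.
good_cot = [0.9, 0.8, 0.95]
weak_cot = [0.4, 0.3, 0.5]

print("log-prob reward (good CoT):", log_prob_reward(good_cot))
print("log-prob reward (weak CoT):", log_prob_reward(weak_cot))
print("binary reward (wrong answer):", binary_reward("41", "42"))
```

Unlike the binary reward, the log-probability reward is graded, so two chains of thought that both miss an exact match can still be ranked, and no answer verifier is needed.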

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Semantic Routing: Exploring Multi-Layer LLM Feature Weighting for Diffusion Transformers

📝 Summary:
This paper introduces depth-wise semantic routing to fuse multi-layer LLM hidden states, enhancing text conditioning in DiT models. It significantly improves text-image alignment and compositional generation. Time-wise fusion can degrade results due to trajectory mismatch.

🔹 Publication Date: Published on Feb 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03510
• PDF: https://arxiv.org/pdf/2602.03510
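
A rough sketch of what fusing multi-layer LLM hidden states could look like, assuming the routing reduces to a softmax-weighted sum over layers with one learnable logit per layer. The paper's actual gating mechanism may be different; `fuse_layers`, the logits, and the toy hidden states are hypothetical.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_layers(hidden_states, layer_logits):
    """Weighted sum of per-layer hidden states.

    hidden_states: list of layers, each a list of token vectors [num_tokens][dim]
    layer_logits:  one (assumed learnable) logit per layer
    """
    if len(hidden_states) != len(layer_logits):
        raise ValueError("need exactly one logit per layer")
    weights = softmax(layer_logits)
    num_tokens = len(hidden_states[0])
    dim = len(hidden_states[0][0])
    fused = [
        [sum(w * layer[t][d] for w, layer in zip(weights, hidden_states)) for d in range(dim)]
        for t in range(num_tokens)
    ]
    return fused, weights

# Toy example: 3 LLM layers, 2 tokens, hidden size 2 (assumed values).
hidden_states = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[2.0, 0.0], [0.0, 2.0]],
    [[3.0, 0.0], [0.0, 3.0]],
]

uniform_fused, uniform_weights = fuse_layers(hidden_states, [0.0, 0.0, 0.0])
deep_fused, deep_weights = fuse_layers(hidden_states, [0.0, 0.0, 4.0])

print("uniform fusion:", uniform_fused)
print("deep-layer-biased weights:", deep_weights)
```

With equal logits the fusion is a plain average of the layers; raising one logit shifts the text conditioning toward that layer's features.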

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
MEnvAgent: Scalable Polyglot Environment Construction for Verifiable Software Engineering

📝 Summary:
MEnvAgent is a multi-language framework that automates environment construction for software engineering tasks using a planning-execution-verification architecture and environment reuse mechanism, ach...

🔹 Publication Date: Published on Jan 30

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.22859
• PDF: https://arxiv.org/pdf/2601.22859

🔹 Datasets citing this paper:
https://huggingface.co/datasets/ernie-research/MEnvBench
https://huggingface.co/datasets/ernie-research/MEnvData-SWE
https://huggingface.co/datasets/ernie-research/MEnvData-SWE-Trajectory

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Efficient Autoregressive Video Diffusion with Dummy Head

📝 Summary:
Autoregressive video diffusion models underutilize historical frames. Dummy Forcing improves efficiency through heterogeneous memory allocation and dynamic head programming. This method achieves up to 2.0x speedup with less than 0.5% quality drop, enabling faster video generation.

🔹 Publication Date: Published on Jan 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.20499
• PDF: https://arxiv.org/pdf/2601.20499
• Project Page: https://csguoh.github.io/project/DummyForcing/
• Github: https://github.com/csguoh/DummyForcing

==================================

#VideoDiffusion #AutoregressiveModels #GenerativeAI #DeepLearning #AI
OmniRad: A Radiological Foundation Model for Multi-Task Medical Image Analysis

📝 Summary:
OmniRad is a self-supervised radiological foundation model pretrained on 1.2 million medical images. It improves classification F1 by 2.05 percent and achieves better segmentation through representation reuse and cross-task transferability.

🔹 Publication Date: Published on Feb 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.04547
• PDF: https://arxiv.org/pdf/2602.04547
• Github: https://github.com/unica-visual-intelligence-lab/OmniRad

==================================

#MedicalAI #FoundationModels #Radiology #SelfSupervisedLearning #MedicalImaging
LongVPO: From Anchored Cues to Self-Reasoning for Long-Form Video Preference Optimization

📝 Summary:
LongVPO is a two-stage DPO framework that lets short-context VLMs understand long videos. It uses synthetic preference data from anchored clips and recursive captioning for multi-segment reasoning. LongVPO achieves state-of-the-art performance with minimal human annotation.

🔹 Publication Date: Published on Feb 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.02341
• PDF: https://arxiv.org/pdf/2602.02341
• Github: https://github.com/MCG-NJU/LongVPO

🔹 Models citing this paper:
https://huggingface.co/MCG-NJU/LongVPO-Stage2-InternVL3-8B
https://huggingface.co/MCG-NJU/LongVPO-Stage1-InternVL3-8B
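
For readers unfamiliar with the DPO objective that the summary above builds on, here is a minimal sketch of the standard per-pair DPO loss. It shows only the generic loss, not LongVPO's two-stage pipeline or its data construction; the log-probability values and `beta=0.1` are assumed for illustration.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair: -log(sigmoid(margin)).

    Each argument is the summed log-probability of a full response under the
    trainable policy or the frozen reference model.
    """
    margin = beta * (
        (policy_chosen_logp - ref_chosen_logp)
        - (policy_rejected_logp - ref_rejected_logp)
    )
    # Numerically stable form of log(1 + exp(-margin)).
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

# Policy identical to the reference: the margin is zero.
neutral = dpo_loss(-10.0, -12.0, -10.0, -12.0)

# Policy has shifted probability toward the chosen response (assumed values).
improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)

print("loss when policy equals reference:", neutral)
print("loss after preferring the chosen response:", improved)
```

The loss falls as the policy raises the chosen response's likelihood relative to the reference model faster than the rejected one's, which is why preference pairs alone are enough to train it.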

==================================

#VideoUnderstanding #MachineLearning #VLMs #DeepLearning #AIResearch
SpatiaLab: Can Vision-Language Models Perform Spatial Reasoning in the Wild?

📝 Summary:
SpatiaLab introduces a comprehensive benchmark to evaluate the spatial reasoning of vision-language models in realistic scenarios. Experiments show a significant performance gap between current models and humans, revealing major limitations in tasks like depth and 3D geometry. This highlights challenges ...

🔹 Publication Date: Published on Feb 3

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.03916
• PDF: https://arxiv.org/pdf/2602.03916
• Project Page: https://spatialab-reasoning.github.io/
• Github: https://github.com/SpatiaLab-Reasoning/SpatiaLab

==================================

#VisionLanguageModels #SpatialReasoning #ComputerVision #AIResearch #DeepLearning