ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
RAGEN-2: Reasoning Collapse in Agentic RL

📝 Summary:
Research identifies template collapse in multi-turn LLM agents as a hidden failure mode undetectable by entropy, proposing mutual information proxies and SNR-aware filtering to improve reasoning quali...
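
To see why entropy alone can miss this kind of failure, here is a minimal toy sketch. It is not the paper's method: the summary does not say which mutual information proxies are used, so this uses a plain plug-in estimate over hypothetical (input, reasoning template) pairs. The names `entropy`, `mutual_information`, `collapsed`, and `responsive` are all made up for illustration.

```python
from collections import Counter
from math import log2

def entropy(samples):
    """Plug-in Shannon entropy (in bits) of a list of hashable samples."""
    n = len(samples)
    return -sum(c / n * log2(c / n) for c in Counter(samples).values())

def mutual_information(pairs):
    """Plug-in estimate of I(X; Z) = H(X) + H(Z) - H(X, Z) from (x, z) pairs."""
    xs = [x for x, _ in pairs]
    zs = [z for _, z in pairs]
    return entropy(xs) + entropy(zs) - entropy(pairs)

# Hypothetical agent A: reasoning varies, but independently of the input.
collapsed = [("a", "t1"), ("a", "t2"), ("b", "t1"), ("b", "t2")]
# Hypothetical agent B: reasoning varies because the input varies.
responsive = [("a", "t1"), ("a", "t1"), ("b", "t2"), ("b", "t2")]

for pairs in (collapsed, responsive):
    reasoning = [z for _, z in pairs]
    print(entropy(reasoning), mutual_information(pairs))
# prints "1.0 0.0" for collapsed, then "1.0 1.0" for responsive
```

Both toy agents have the same reasoning entropy (1 bit), yet only the second one's reasoning carries any information about the input. That is the sense in which an entropy monitor can look healthy while the reasoning has stopped depending on the task.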

🔹 Publication Date: Published on Apr 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.06268
• PDF: https://arxiv.org/pdf/2604.06268
• Project Page: https://ragen-ai.github.io/v2/
• Github: https://github.com/mll-lab-nu/RAGEN

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
Improving Semantic Proximity in Information Retrieval through Cross-Lingual Alignment

📝 Summary:
Multilingual retrieval models exhibit bias toward English documents in mixed-language document pools, which is addressed through a novel training strategy that improves cross-lingual alignment with mi...

🔹 Publication Date: Published on Apr 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.05684
• PDF: https://arxiv.org/pdf/2604.05684

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling

📝 Summary:
INSPATIO-WORLD presents a real-time framework for generating high-fidelity dynamic scenes from single videos using a spatiotemporal autoregressive architecture and joint distribution matching distillati...

🔹 Publication Date: Published on Apr 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07209
• PDF: https://arxiv.org/pdf/2604.07209
• Project Page: https://inspatio.github.io/inspatio-world/
• Github: https://github.com/inspatio/inspatio-world

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
VenusBench-Mobile: A Challenging and User-Centric Benchmark for Mobile GUI Agents with Capability Diagnostics

📝 Summary:
VenusBench-Mobile presents a comprehensive evaluation framework for mobile GUI agents that reveals significant performance gaps compared to existing benchmarks, emphasizing the need for more robust re...

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.06182
• PDF: https://arxiv.org/pdf/2604.06182
• Github: https://github.com/inclusionAI/UI-Venus/tree/VenusBench-Mobile

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling

📝 Summary:
A novel two-stage reinforcement learning framework called Sol-RL integrates FP4 quantization with diffusion model alignment to accelerate training while maintaining high-fidelity performance.

🔹 Publication Date: Published on Apr 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.06916
• PDF: https://arxiv.org/pdf/2604.06916
• Project Page: https://nvlabs.github.io/Sana/Sol-RL/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Think in Strokes, Not Pixels: Process-Driven Image Generation via Interleaved Reasoning

📝 Summary:
This paper introduces process-driven image generation, an iterative method with interleaved textual and visual reasoning. It decomposes synthesis into planning, drafting, reflecting, and refining steps. Dense step-wise supervision ensures consistency and interpretability of intermediate states.

🔹 Publication Date: Published on Apr 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.04746
• PDF: https://arxiv.org/pdf/2604.04746

==================================

#ImageGeneration #GenerativeAI #ArtificialIntelligence #DeepLearning #ComputerVision
TC-AE: Unlocking Token Capacity for Deep Compression Autoencoders

📝 Summary:
TC-AE is a Vision Transformer-based architecture that improves deep compression autoencoders by addressing token space limitations and enhancing semantic structures through joint self-supervised train...

🔹 Publication Date: Published on Apr 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07340
• PDF: https://arxiv.org/pdf/2604.07340
• Github: https://github.com/inclusionAI/TC-AE

🔹 Models citing this paper:
https://huggingface.co/inclusionAI/TC-AE

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
DeonticBench: A Benchmark for Reasoning over Rules

📝 Summary:
DeonticBench presents a benchmark for evaluating large language models on complex, context-specific deontic reasoning tasks drawn from real-world legal and policy domains, supporting both symbolic and...

🔹 Publication Date: Published on Apr 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.04443
• PDF: https://arxiv.org/pdf/2604.04443
• Project Page: https://huggingface.co/datasets/gydou/DeonticBench
• Github: https://github.com/guangyaodou/DeonticBench

🔹 Datasets citing this paper:
https://huggingface.co/datasets/gydou/DeonticBench

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
The Depth Ceiling: On the Limits of Large Language Models in Discovering Latent Planning

📝 Summary:
Research reveals that large language models can perform latent reasoning with varying depths, but there's a gap between discovering and executing multi-step planning strategies, suggesting limitations...

🔹 Publication Date: Published on Apr 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.06427
• PDF: https://arxiv.org/pdf/2604.06427

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Q-Zoom: Query-Aware Adaptive Perception for Efficient Multimodal Large Language Models

📝 Summary:
Q-Zoom enhances MLLM performance by adaptively focusing computational resources on relevant visual regions through dynamic gating and self-distilled region proposal networks, achieving faster inferenc...

🔹 Publication Date: Published on Apr 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.06912
• PDF: https://arxiv.org/pdf/2604.06912
• Project Page: https://yuhengsss.github.io/Q-Zoom/

🔹 Models citing this paper:
https://huggingface.co/YuhengSSS/Q-Zoom-Qwen2.5VL-3B
https://huggingface.co/YuhengSSS/Q-Zoom-Qwen2.5VL-7B
https://huggingface.co/YuhengSSS/Q-Zoom-Qwen3VL-4B

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Fast Spatial Memory with Elastic Test-Time Training

📝 Summary:
Elastic Test-Time Training with fast spatial memory enables efficient 4D reconstruction through multi-chunk adaptation while maintaining stability against catastrophic forgetting.

🔹 Publication Date: Published on Apr 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.07350
• PDF: https://arxiv.org/pdf/2604.07350
• Project Page: https://fast-spatial-memory.github.io/
• Github: https://github.com/Mars-tin/fast-spatial-mem

🔹 Models citing this paper:
https://huggingface.co/marstin/fast-spatial-mem

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
FlowInOne: Unifying Multimodal Generation as Image-in, Image-out Flow Matching

📝 Summary:
FlowInOne presents a vision-centric multimodal generation framework that unifies diverse input modalities into a single visual representation, enabling coherent image generation and editing through a ...

🔹 Publication Date: Published on Apr 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.06757
• PDF: https://arxiv.org/pdf/2604.06757
• Project Page: https://csu-jpg.github.io/FlowInOne.github.io/

🔹 Models citing this paper:
https://huggingface.co/CSU-JPG/FlowInOne

🔹 Datasets citing this paper:
https://huggingface.co/datasets/CSU-JPG/VisPrompt5M
https://huggingface.co/datasets/CSU-JPG/VPBench

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
SEVerA: Verified Synthesis of Self-Evolving Agents

📝 Summary:
Formally Guarded Generative Models enable safe and correct agentic code generation by combining formal specifications with soft objectives, ensuring reliability in autonomous agent systems.

🔹 Publication Date: Published on Mar 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.25111
• PDF: https://arxiv.org/pdf/2603.25111
• Github: https://github.com/uiuc-focal-lab/severa

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Beyond Hard Negatives: The Importance of Score Distribution in Knowledge Distillation for Dense Retrieval

📝 Summary:
Stratified sampling improves knowledge distillation by preserving the full range of teacher scores, outperforming traditional sampling methods in retrieval tasks.
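
A rough sketch of the idea, under assumptions: the summary does not give the paper's exact stratification rule, so this version simply sorts candidates by teacher score, splits the ranks into `k` equal strata, and draws one candidate per stratum. The function names and the toy scores are hypothetical; `top_k` is included only as the hard-negative style baseline for contrast.

```python
import random

def stratified_sample(scores, k, seed=0):
    """Pick k candidate indices so the teacher scores span the full range."""
    if not 1 <= k <= len(scores):
        raise ValueError("k must be between 1 and len(scores)")
    rng = random.Random(seed)
    # Candidate indices ordered from highest to lowest teacher score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    n = len(order)
    picked = []
    for s in range(k):
        # Stratum s covers ranks [lo, hi); the k strata partition all ranks.
        lo, hi = s * n // k, (s + 1) * n // k
        picked.append(order[rng.randrange(lo, hi)])
    return picked

def top_k(scores, k):
    """Baseline: keep only the k highest-scoring candidates."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]

# Toy teacher scores for 20 candidate passages (made-up values).
teacher_scores = list(range(20))
print(top_k(teacher_scores, 4))              # only the top of the range
print(stratified_sample(teacher_scores, 4))  # one pick from each quarter
```

With the baseline, the student only ever sees the upper end of the teacher's score distribution; the stratified draw gives it one example from each quarter, which is the "full range" property the summary refers to.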

🔹 Publication Date: Published on Apr 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.04734
• PDF: https://arxiv.org/pdf/2604.04734

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Learning to Hint for Reinforcement Learning

📝 Summary:
HiLL is a reinforcement learning framework that adaptively generates hints conditioned on reasoner errors to improve learning signals and transfer performance in group relative policy optimization.
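
For context on what "learning signal" means in group relative policy optimization (GRPO), here is a minimal sketch of the standard group-normalized advantage. This is background, not HiLL's code: the function name, the `eps` constant, and the toy rewards are assumptions, and reading hints as a way to break all-fail groups is an interpretation of the summary rather than something it states.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each rollout's reward against its own sampling group."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the group
    # eps (an assumed small constant) avoids division by zero.
    return [(r - mu) / (sigma + eps) for r in rewards]

# Every rollout fails: all advantages are zero, so there is no gradient signal.
print(group_relative_advantages([0, 0, 0, 0]))

# One rollout succeeds: the group now separates good from bad.
print(group_relative_advantages([1, 0, 0, 0]))
```

When every rollout in a group gets the same reward, the normalized advantages vanish and the update carries no information. Anything that turns an all-fail group into a mixed one, such as a hint that lets some rollouts succeed, restores a non-zero signal.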

🔹 Publication Date: Published on Apr 1

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.00698
• PDF: https://arxiv.org/pdf/2604.00698
• Github: https://github.com/Andree-9/HiLL

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Tunable Soft Equivariance with Guarantees

📝 Summary:
A general framework for constructing soft equivariant models through weight projection into designed subspaces is proposed, demonstrating improved performance and reduced equivariance error across mul...

🔹 Publication Date: Published on Mar 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2603.26657
• PDF: https://arxiv.org/pdf/2603.26657
• Github: https://github.com/ashiq24/soft-equivariance

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tokens

📝 Summary:
DeltaTok encodes visual feature differences as delta tokens, and DeltaWorld generates diverse video futures with reduced parameters and computational cost through multi-hypothesis training.

🔹 Publication Date: Published on Apr 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.04913
• PDF: https://arxiv.org/pdf/2604.04913
• Project Page: https://deltatok.github.io
• Hugging Face Collection: https://huggingface.co/collections/Amazon-FAR/deltatok

🔹 Models citing this paper:
https://huggingface.co/Amazon-FAR/deltatok-kinetics
https://huggingface.co/Amazon-FAR/deltaworld-kinetics
https://huggingface.co/Amazon-FAR/seg-head-vspw

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
R3PM-Net: Real-time, Robust, Real-world Point Matching Network

📝 Summary:
R3PM-Net is a lightweight, global-aware point matching network that achieves high-speed and accurate point cloud registration with competitive performance on real-world datasets.

🔹 Publication Date: Published on Apr 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.05060
• PDF: https://arxiv.org/pdf/2604.05060
• Project Page: https://yasiikb.github.io/R3PM-Net/
• Github: https://github.com/YasiiKB/R3PM-Net

🔹 Datasets citing this paper:
https://huggingface.co/datasets/YasiiKB/R3PM-Net

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Qualixar OS: A Universal Operating System for AI Agent Orchestration

📝 Summary:
Qualixar OS enables universal AI agent orchestration through a comprehensive runtime environment supporting diverse LLM providers, agent frameworks, and communication protocols, featuring advanced mul...

🔹 Publication Date: Published on Apr 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.06392
• PDF: https://arxiv.org/pdf/2604.06392

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
AgentGL: Towards Agentic Graph Learning with LLMs via Reinforcement Learning

📝 Summary:
AgentGL is a reinforcement learning-driven framework that enables large language models to navigate and reason over complex relational data by integrating graph-native tools and curriculum learning st...

🔹 Publication Date: Published on Apr 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.05846
• PDF: https://arxiv.org/pdf/2604.05846
• Github: https://github.com/sunyuanfu/AgentGL

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
A Systematic Study of Cross-Modal Typographic Attacks on Audio-Visual Reasoning

📝 Summary:
Multi-modal typography attacks demonstrate significantly higher success rates than unimodal attacks by exploiting cross-modal vulnerabilities in audio-visual multi-modal large language models.

🔹 Publication Date: Published on Apr 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2604.03995
• PDF: https://arxiv.org/pdf/2604.03995
• Project Page: https://cskyl.github.io/MLLM-Typography/

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research