ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Multimodal Fact-Level Attribution for Verifiable Reasoning

📝 Summary:
MuRGAt is a benchmark for evaluating fact-level multimodal attribution in complex reasoning tasks, requiring models to provide precise citations for their answers across video, audio, and other modali...

🔹 Publication Date: Published on Feb 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.11509
• PDF: https://arxiv.org/pdf/2602.11509
• Github: https://github.com/meetdavidwan/murgat

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
NarraScore: Bridging Visual Narrative and Musical Dynamics via Hierarchical Affective Control

📝 Summary:
NarraScore is a hierarchical framework for long video soundtracks. It uses frozen Vision-Language Models as affective sensors to distill narrative emotion. A dual injection strategy combines global stability with local modulation for efficient, narratively aligned soundtracks.

🔹 Publication Date: Published on Feb 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.09070
• PDF: https://arxiv.org/pdf/2602.09070

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
T3D: Few-Step Diffusion Language Models via Trajectory Self-Distillation with Direct Discriminative Optimization

📝 Summary:
A trajectory self-distillation framework with direct discriminative optimization improves few-step decoding efficiency in diffusion large language models while maintaining generation quality.

🔹 Publication Date: Published on Feb 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.12262
• PDF: https://arxiv.org/pdf/2602.12262
• Github: https://github.com/Tyrion58/T3D

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
PISCO: Precise Video Instance Insertion with Sparse Control

📝 Summary:
Video diffusion model PISCO enables precise instance insertion with sparse keyframe control through variable-information guidance and distribution-preserving temporal masking.

🔹 Publication Date: Published on Feb 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.08277
• PDF: https://arxiv.org/pdf/2602.08277
• Project Page: https://xiangbogaobarry.github.io/PISCO/
• Github: https://github.com/taco-group/PISCO

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
DeepSight: An All-in-One LM Safety Toolkit

📝 Summary:
DeepSight is an open-source project that integrates safety evaluation and diagnosis for large language and multimodal models, enabling white-box insights through unified protocols and specialized tool...

🔹 Publication Date: Published on Feb 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.12092
• PDF: https://arxiv.org/pdf/2602.12092
• Project Page: https://github.com/AI45Lab/DeepScan/
• Github: https://github.com/AI45Lab/DeepSafe

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Think Longer to Explore Deeper: Learn to Explore In-Context via Length-Incentivized Reinforcement Learning

📝 Summary:
Models require in-context exploration capabilities to scale effectively at test time, but autoregressive generation faces exponential decay in sampling long sequences, which is addressed by a length-i...
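
One way to read the exponential decay mentioned above, offered as an illustration rather than the paper's own derivation (the summary is truncated, and the symbols y, L, and q are introduced here as assumptions): under autoregressive sampling, the probability of any one specific length-L sequence is a product of per-token probabilities, so it shrinks geometrically whenever each factor is bounded below 1.

```latex
P(y_{1:L}) \;=\; \prod_{t=1}^{L} p\left(y_t \mid y_{<t}\right) \;\le\; q^{L},
\qquad \text{where } q = \max_{1 \le t \le L} p\left(y_t \mid y_{<t}\right) < 1 .
```

For example, with q = 0.9 the bound is about 0.35 at L = 10 and about 2.7e-5 at L = 100.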

🔹 Publication Date: Published on Feb 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.11748
• PDF: https://arxiv.org/pdf/2602.11748
• Github: https://github.com/LINs-lab/LIE

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
DeepGen 1.0: A Lightweight Unified Multimodal Model for Advancing Image Generation and Editing

📝 Summary:
A lightweight 5B unified multimodal model achieves competitive performance through hierarchical feature extraction, learnable think tokens, and progressive training strategies including alignment pre-...

🔹 Publication Date: Published on Feb 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.12205
• PDF: https://arxiv.org/pdf/2602.12205
• Project Page: https://deepgenteam.github.io/
• Github: https://github.com/DeepGenTeam/DeepGen

🔹 Models citing this paper:
https://huggingface.co/deepgenteam/DeepGen-1.0

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
P-GenRM: Personalized Generative Reward Model with Test-time User-based Scaling

📝 Summary:
Personalized generative reward models address challenges in adapting language model responses to individual user preferences by using structured evaluation chains and dual-granularity scaling mechanis...

🔹 Publication Date: Published on Feb 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.12116
• PDF: https://arxiv.org/pdf/2602.12116
• Github: https://github.com/Tongyi-ConvAI/Qwen-Character/tree/main/Character-GenRM

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Detecting RLVR Training Data via Structural Convergence of Reasoning

📝 Summary:
RLVR training induces a detectable behavioral signature where seen prompts yield less diverse generations. A new black-box detector, Min-kNN Distance, quantifies this structural convergence to reliably detect RLVR training data, outperforming existing methods.
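
A minimal sketch of the idea that seen prompts yield less diverse generations. This is not the paper's definition of Min-kNN Distance: the Jaccard distance over word sets, the function names, and the toy strings are all assumptions made for illustration. It samples several generations for one prompt, measures each generation's mean distance to its k nearest neighbours, and reports the smallest such value, so a low score suggests the generations have collapsed onto near-identical outputs.

```python
from typing import List, Set


def jaccard_distance(a: Set[str], b: Set[str]) -> float:
    """Return 1 - |a & b| / |a | b|; 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def min_knn_distance(generations: List[str], k: int = 2) -> float:
    """Smallest mean distance from any generation to its k nearest neighbours.

    Hypothetical scoring rule for illustration only.
    """
    if len(generations) < k + 1:
        raise ValueError("need at least k + 1 generations")
    # Assumption: lower-cased whitespace tokens stand in for a real embedding.
    token_sets = [set(g.lower().split()) for g in generations]
    best = float("inf")
    for i, ti in enumerate(token_sets):
        dists = sorted(
            jaccard_distance(ti, tj)
            for j, tj in enumerate(token_sets)
            if j != i
        )
        best = min(best, sum(dists[:k]) / k)
    return best


# Toy generations: the first group is nearly identical, the second is diverse.
seen_like = [
    "add the two numbers then divide by four",
    "add the two numbers then divide by four",
    "add the two numbers and divide by four",
]
unseen_like = [
    "try factoring the quadratic first",
    "use the substitution u equals x squared",
    "draw a diagram and count cases",
]
score_seen = min_knn_distance(seen_like, k=2)
score_unseen = min_knn_distance(unseen_like, k=2)
print(score_seen < score_unseen)
```

In practice a threshold on the score would have to be calibrated on prompts with known membership; the summary does not say how the paper does this.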

🔹 Publication Date: Published on Feb 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.11792
• PDF: https://arxiv.org/pdf/2602.11792
• Project Page: https://stevenzhb.github.io/detect-rlvr-data/
• Github: https://github.com/StevenZHB/Detect_RLVR_Data

==================================

#AI #MachineLearning #GenerativeAI #LLMs #DataDetection
Thinking with Drafting: Optical Decompression via Logical Reconstruction

📝 Summary:
Current AI struggles with precise visual reasoning. We propose Thinking with Drafting (TwD), a DSL-based approach that decompresses visual tokens into logical structures. This generates verifiable visual proofs, turning visual generation into a logical verifier for robust reasoning.

🔹 Publication Date: Published on Feb 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.11731
• PDF: https://arxiv.org/pdf/2602.11731

==================================

#AI #VisualReasoning #ComputerVision #Logic #RobustAI
MetaphorStar: Image Metaphor Understanding and Reasoning with End-to-End Visual Reinforcement Learning

📝 Summary:
MetaphorStar is an end-to-end visual reinforcement learning framework that addresses the difficulty AI systems have in understanding image metaphors. It introduces a new dataset, RL method, and benchmark. MetaphorStar achieves state-of-the-art performance, outperforming many MLLMs and improving general visual reasoning.

🔹 Publication Date: Published on Feb 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.10575
• PDF: https://arxiv.org/pdf/2602.10575
• Project Page: https://metaphorstar.github.io/
• Github: https://github.com/MING-ZCH/MetaphorStar

🔹 Models citing this paper:
https://huggingface.co/MING-ZCH/MetaphorStar-32B
https://huggingface.co/MING-ZCH/MetaphorStar-3B
https://huggingface.co/MING-ZCH/MetaphorStar-7B

🔹 Datasets citing this paper:
https://huggingface.co/datasets/MING-ZCH/TFQ-Bench-Lite
https://huggingface.co/datasets/MING-ZCH/TFQ-Bench-Full
https://huggingface.co/datasets/MING-ZCH/TFQ-Data-Full

==================================

#AI #ReinforcementLearning #ComputerVision #ImageMetaphor #VisualReasoning
Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models

📝 Summary:
Composition-RL improves RL by composing multiple easy problems into new, verifiable questions. This enhances model reasoning capabilities, especially with curriculum learning and cross-domain applications.
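
A minimal sketch of composing easy problems into a new verifiable question. The summary does not describe the paper's actual composition procedure, so the chaining scheme below (feed the first answer into the second problem as X), the class names, and the toy arithmetic are all assumptions made for illustration. The point is only that the composed problem keeps a ground-truth answer, so a rule-based reward can still check it.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Problem:
    """An easy problem with a known, checkable integer answer."""
    question: str
    answer: int


@dataclass(frozen=True)
class ParametricProblem:
    """A problem whose text refers to a placeholder X (hypothetical format)."""
    template: str
    solve: Callable[[int], int]  # ground-truth answer as a function of X


def compose(first: Problem, second: ParametricProblem) -> Problem:
    """Chain two problems: the first answer becomes X in the second."""
    question = (
        f"Let X be the answer to: {first.question} "
        f"Then answer: {second.template}"
    )
    return Problem(question=question, answer=second.solve(first.answer))


def verify(problem: Problem, model_answer: int) -> float:
    """Binary reward: 1.0 if the final answer matches the ground truth."""
    return 1.0 if model_answer == problem.answer else 0.0


easy_a = Problem("What is 3 + 4?", 7)
easy_b = ParametricProblem("What is X * 6?", lambda x: x * 6)
composed = compose(easy_a, easy_b)
print(composed.question)
print(verify(composed, 42))
```

Because the composed answer is computed from the component answers, solving the new question requires getting both steps right, which is one plausible reason composition could raise difficulty without losing verifiability.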

🔹 Publication Date: Published on Feb 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.12036
• PDF: https://arxiv.org/pdf/2602.12036

==================================

#ReinforcementLearning #LLMs #PromptEngineering #ArtificialIntelligence #MachineLearning
χ₀: Resource-Aware Robust Manipulation via Taming Distributional Inconsistencies

📝 Summary:
χ₀ is a resource-efficient framework for robust robotic manipulation. It tackles distributional shifts in long-horizon tasks using model arithmetic, stage advantage, and train-deploy alignment. This achieves high-reliability autonomy, surpassing state-of-the-art by 250% in success rate.

🔹 Publication Date: Published on Feb 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.09021
• PDF: https://arxiv.org/pdf/2602.09021
• Project Page: https://mmlab.hk/research/kai0
• Github: https://github.com/OpenDriveLab/KAI0

==================================

#Robotics #AI #MachineLearning #AutonomousSystems #RobustAI
RISE: Self-Improving Robot Policy with Compositional World Model

📝 Summary:
RISE is a robotic reinforcement learning framework using a compositional world model to predict futures and evaluate imagined outcomes. This allows policy improvement through virtual interactions, avoiding costly physical trials. RISE achieved significant performance gains in challenging real-wor...

🔹 Publication Date: Published on Feb 11

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.11075
• PDF: https://arxiv.org/pdf/2602.11075
• Project Page: https://opendrivelab.com/kai0-rl/

==================================

#Robotics #ReinforcementLearning #WorldModels #AI #MachineLearning
EgoHumanoid: Unlocking In-the-Wild Loco-Manipulation with Robot-Free Egocentric Demonstration

📝 Summary:
EgoHumanoid enables humanoid loco-manipulation through co-training vision-language-action policies using egocentric human demonstrations and limited robot data, addressing embodiment gaps via view and...

🔹 Publication Date: Published on Feb 10

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.10106
• PDF: https://arxiv.org/pdf/2602.10106

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Sparse Video Generation Propels Real-World Beyond-the-View Vision-Language Navigation

📝 Summary:
Vision-language navigation systems traditionally require detailed instructions but can be improved by incorporating video generation models with sparse future planning for faster, more efficient real-...

🔹 Publication Date: Published on Feb 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.05827
• PDF: https://arxiv.org/pdf/2602.05827
• Project Page: https://opendrivelab.com/SparseVideoNav/
• Github: https://github.com/opendrivelab/sparsevideonav

==================================

#AI #DataScience #MachineLearning #HuggingFace #Research
Adapting Vision-Language Models for E-commerce Understanding at Scale

📝 Summary:
This paper demonstrates that targeted adaptation of general Vision-Language Models significantly improves e-commerce product understanding while preserving broad multimodal capabilities. A novel evaluation suite for deep product understanding is also proposed.

🔹 Publication Date: Published on Feb 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.11733
• PDF: https://arxiv.org/pdf/2602.11733

==================================

#VisionLanguageModels #EcommerceAI #ProductUnderstanding #DeepLearning #MultimodalAI
ExStrucTiny: A Benchmark for Schema-Variable Structured Information Extraction from Document Images

📝 Summary:
ExStrucTiny is a new benchmark dataset for structured information extraction from document images. It addresses limitations of existing datasets by covering diverse document types and flexible schemas. This aims to improve generalist models for structured information extraction.

🔹 Publication Date: Published on Feb 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.12203
• PDF: https://arxiv.org/pdf/2602.12203

==================================

#InformationExtraction #DocumentAI #MachineLearning #Dataset #ComputerVision
Towards Robust Mathematical Reasoning

📝 Summary:
IMO-Bench introduces advanced math benchmarks including short-answer and proof-writing tasks for foundation models. Gemini Deep Think achieved gold-level IMO 2025 performance using IMO-Bench, showing significant progress in robust mathematical reasoning.

🔹 Publication Date: Published on Nov 3, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.01846
• PDF: https://arxiv.org/pdf/2511.01846
• Project Page: https://imobench.github.io/
• Github: https://github.com/google-deepmind/superhuman

🔹 Datasets citing this paper:
https://huggingface.co/datasets/Hwilner/imo-answerbench
https://huggingface.co/datasets/OpenEvals/IMO-AnswerBench
https://huggingface.co/datasets/Hwilner/imo-gradingbench

==================================

#MathematicalReasoning #AIBenchmarks #FoundationModels #DeepLearning #IMOBench