ML Research Hub
32.9K subscribers
5.48K photos
348 videos
24 files
5.93K links
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Large Language Model Reasoning Failures

📝 Summary:
This paper surveys reasoning failures in large language models, proposing a novel categorization. It classifies failures into embodied and non-embodied types, and further into fundamental, application-specific, and robustness issues. The work unifies research to guide future efforts for stronger ...

🔹 Publication Date: Published on Feb 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06176
• PDF: https://arxiv.org/pdf/2602.06176

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
SPARC: Separating Perception And Reasoning Circuits for Test-time Scaling of VLMs

📝 Summary:
SPARC decouples visual perception and reasoning in VLMs using a two-stage pipeline. This enables efficient test-time scaling with targeted compute allocation, significantly improving visual reasoning performance and reducing token budget compared to monolithic baselines.

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06566
• PDF: https://arxiv.org/pdf/2602.06566

==================================

Outcome Accuracy is Not Enough: Aligning the Reasoning Process of Reward Models

📝 Summary:
Generative Reward Models suffer from deceptive alignment when they prioritize outcome accuracy. The paper introduces Rationale Consistency, a metric that aligns reasoning with human judgment, together with a hybrid training signal; the combination improves performance, avoids deceptive alignment, and boosts RLHF.

🔹 Publication Date: Published on Feb 4

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.04649
• PDF: https://arxiv.org/pdf/2602.04649
• Github: https://github.com/QwenLM/RationaleRM

Datasets citing this paper:
https://huggingface.co/datasets/Qwen/RationaleRM

==================================

Uncertainty Drives Social Bias Changes in Quantized Large Language Models

📝 Summary:
Post-training quantization of large language models causes significant changes in social biases that aggregate metrics fail to detect, with quantization-induced masked bias flipping occurring more fre...

🔹 Publication Date: Published on Feb 5

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06181
• PDF: https://arxiv.org/pdf/2602.06181

==================================

Alleviating Sparse Rewards by Modeling Step-Wise and Long-Term Sampling Effects in Flow-Based GRPO

📝 Summary:
TP-GRPO enhances GRPO for flow matching by using step-level incremental rewards instead of outcome-based ones. It also identifies turning points in denoising trajectories to capture and aggregate long-term effects. This improves reward signal effectiveness and consistently enhances generation qua...

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06422
• PDF: https://arxiv.org/pdf/2602.06422
• Github: https://github.com/YunzeTong/TurningPoint-GRPO
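
A minimal sketch of the step-level idea in the summary above, not the paper's implementation: if each intermediate denoising state could be scored, the incremental reward of a step would be the change in score across that step. The function name `incremental_rewards` and the example scores are hypothetical, and the turning-point aggregation is not shown because the summary does not define it.

```python
def incremental_rewards(step_scores):
    """Turn per-step scores into step-wise reward increments.

    step_scores[t] is a hypothetical reward-model score of the sample
    after denoising step t; step_scores[0] scores the starting point.
    """
    return [later - earlier for earlier, later in zip(step_scores, step_scores[1:])]


# Illustrative scores only (not taken from the paper).
scores = [0.0, 0.25, 0.125, 0.5, 1.0]
increments = incremental_rewards(scores)
print(increments)
```

The increments telescope: their sum equals the final score minus the initial one, so a purely outcome-based reward is the special case where only that total is observed.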

==================================

NanoQuant: Efficient Sub-1-Bit Quantization of Large Language Models

📝 Summary:
NanoQuant enables efficient post-training quantization of large language models to binary and sub-1-bit levels using low-rank binary factorization and ADMM optimization, achieving state-of-the-art acc...

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06694
• PDF: https://arxiv.org/pdf/2602.06694
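
A rough arithmetic sketch of how a low-rank binary factorization can fall below 1 bit per weight. It assumes a weight matrix of shape rows x cols is approximated by two binary factors of shapes rows x rank and rank x cols, and it ignores any scale factors or other overhead; the function name and the example sizes are hypothetical, not figures from the paper.

```python
def bits_per_weight(rows, cols, rank):
    """Effective storage cost, in bits per original weight, of two binary factors.

    Assumes one bit per factor entry and no extra overhead (an idealization).
    """
    stored_bits = rows * rank + rank * cols
    return stored_bits / (rows * cols)


# Illustrative sizes only.
print(bits_per_weight(4096, 4096, 1024))  # 0.5
print(bits_per_weight(4096, 4096, 2048))  # 1.0
```

Under these assumptions the cost drops below 1 bit whenever rank < rows * cols / (rows + cols).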

==================================

RelayGen: Intra-Generation Model Switching for Efficient Reasoning

📝 Summary:
RelayGen is a training-free framework that dynamically switches between large and small models during reasoning by identifying difficulty transitions at the segment level, achieving faster inference w...

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06454
• PDF: https://arxiv.org/pdf/2602.06454
• Github: https://github.com/jiwonsong-dev/RelayGen

==================================

Modality Gap-Driven Subspace Alignment Training Paradigm For Multimodal Large Language Models

📝 Summary:
Researchers address the modality gap in multimodal learning by proposing a fixed-frame theory and a training-free alignment method that enables efficient scaling of multimodal models using unpaired da...

🔹 Publication Date: Published on Feb 2

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.07026
• PDF: https://arxiv.org/pdf/2602.07026

==================================

ECO: Energy-Constrained Optimization with Reinforcement Learning for Humanoid Walking

📝 Summary:
Energy-constrained optimization framework separates energy metrics from rewards using Lagrangian method to achieve stable, energy-efficient humanoid robot locomotion with reduced hyperparameter tuning...

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06445
• PDF: https://arxiv.org/pdf/2602.06445
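
A generic sketch of the Lagrangian approach to constrained RL that the summary refers to, assuming a single energy budget and a scalar multiplier updated by dual ascent. The function names, learning rate, and numbers are hypothetical; the paper's actual constraint set and update rule may differ.

```python
def lagrangian_objective(task_return, energy_cost, energy_budget, lmbda):
    """Policy objective: task return minus a penalty for exceeding the budget."""
    return task_return - lmbda * (energy_cost - energy_budget)


def dual_ascent_step(lmbda, energy_cost, energy_budget, lr=0.1):
    """Raise the multiplier when the budget is violated, lower it otherwise.

    The multiplier is clipped at zero so it never rewards extra energy use.
    """
    return max(0.0, lmbda + lr * (energy_cost - energy_budget))


# Illustrative values only.
lmbda = dual_ascent_step(0.0, energy_cost=12.0, energy_budget=10.0, lr=0.5)
print(lmbda)  # 1.0
print(lagrangian_objective(100.0, 12.0, 10.0, lmbda))  # 98.0
```

Because the multiplier adapts on its own, the energy term does not need a hand-tuned reward weight, which is consistent with the reduced hyperparameter tuning the summary mentions.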

==================================

Cybersecurity AI: Humanoid Robots as Attack Vectors

📝 Summary:
The Unitree G1 humanoid robot is vulnerable to BLE provisioning protocol exploits, exfiltrates sensor data, and can be repurposed for active cyber operations, highlighting the need for improved securi...

🔹 Publication Date: Published on Sep 17, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.14139
• PDF: https://arxiv.org/pdf/2509.14139
• Project Page: https://aliasrobotics.com
• Github: https://github.com/aliasrobotics/cai

==================================

Towards Bridging the Gap between Large-Scale Pretraining and Efficient Finetuning for Humanoid Control

📝 Summary:
Off-policy Soft Actor-Critic with large-batch updates enables efficient humanoid locomotion policy pretraining, while model-based methods facilitate safe adaptation through deterministic data collecti...

🔹 Publication Date: Published on Jan 29

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.21363
• PDF: https://arxiv.org/pdf/2601.21363
• Github: https://github.com/bigai-ai/LIFT-humanoid

==================================

LOCA-bench: Benchmarking Language Agents Under Controllable and Extreme Context Growth

📝 Summary:
LOCA-bench is a new benchmark for evaluating language agents in long-context agentic scenarios with controlled growth of the environment state. It assesses how models and context management strategies perform as the context grows, finding that advanced techniques significantly improve success rates.

🔹 Publication Date: Published on Feb 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.07962
• PDF: https://arxiv.org/pdf/2602.07962
• Github: https://github.com/hkust-nlp/LOCA-bench

==================================

LatentChem: From Textual CoT to Latent Thinking in Chemical Reasoning

📝 Summary:
LatentChem enables chemical reasoning through continuous latent space computations instead of discrete textual tokens, achieving superior performance and efficiency compared to traditional chain-of-th...

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.07075
• PDF: https://arxiv.org/pdf/2602.07075

==================================

AgentCPM-Report: Interleaving Drafting and Deepening for Open-Ended Deep Research

📝 Summary:
AgentCPM-Report presents a lightweight local solution for deep research report generation using a Writing As Reasoning Policy framework and multi-stage agentic training to enhance small models' reason...

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.06540
• PDF: https://arxiv.org/pdf/2602.06540
• Github: https://github.com/OpenBMB/AgentCPM

🔹 Models citing this paper:
https://huggingface.co/openbmb/AgentCPM-Report
https://huggingface.co/openbmb/AgentCPM-Report-GGUF

==================================

When and How Much to Imagine: Adaptive Test-Time Scaling with World Models for Visual Spatial Reasoning

📝 Summary:
Adaptive test-time framework with world models enables selective visual imagination for spatial reasoning, improving efficiency and reliability by determining when imagination is necessary.

🔹 Publication Date: Published on Feb 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.08236
• PDF: https://arxiv.org/pdf/2602.08236
• Project Page: https://adaptive-visual-tts.github.io/

==================================

Recurrent-Depth VLA: Implicit Test-Time Compute Scaling of Vision-Language-Action Models via Latent Iterative Reasoning

📝 Summary:
RD-VLA introduces a recurrent architecture for VLA models, using latent iterative refinement for adaptive compute. It maintains constant memory, boosts success on complex tasks, and offers significant speedups.

🔹 Publication Date: Published on Feb 8

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.07845
• PDF: https://arxiv.org/pdf/2602.07845
• Project Page: https://rd-vla.github.io/
• Github: https://github.com/rd-vla/rd-vla

==================================

Demo-ICL: In-Context Learning for Procedural Video Knowledge Acquisition

📝 Summary:
Researchers introduce a new video understanding task and benchmark that evaluates models' ability to learn from few-shot demonstrations, along with a specialized MLLM architecture trained using a two-...

🔹 Publication Date: Published on Feb 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.08439
• PDF: https://arxiv.org/pdf/2602.08439
• Github: https://github.com/dongyh20/Demo-ICL

==================================

QuantaAlpha: An Evolutionary Framework for LLM-Driven Alpha Mining

📝 Summary:
Financial markets are noisy and non-stationary, making alpha mining highly sensitive to noise in backtesting results and sudden market regime shifts. While recent agentic frameworks improve alpha mini...

🔹 Publication Date: Published on Feb 6

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.07085
• PDF: https://arxiv.org/pdf/2602.07085
• Github: https://github.com/QuantaAlpha/QuantaAlpha

==================================

LLaDA2.1: Speeding Up Text Diffusion via Token Editing

📝 Summary:
LLaDA2.1 introduces a novel token-to-token editing approach with speed and quality modes, enhanced through reinforcement learning for improved reasoning and instruction following in large language dif...

🔹 Publication Date: Published on Feb 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.08676
• PDF: https://arxiv.org/pdf/2602.08676
• Github: https://github.com/inclusionAI/LLaDA2.X

🔹 Models citing this paper:
https://huggingface.co/inclusionAI/LLaDA2.1-mini
https://huggingface.co/inclusionAI/LLaDA2.1-flash

==================================

WorldCompass: Reinforcement Learning for Long-Horizon World Models

📝 Summary:
WorldCompass enhances long-horizon video-based world models through reinforcement learning post-training with clip-level rollouts, complementary rewards, and efficient RL algorithms.

🔹 Publication Date: Published on Feb 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.09022
• PDF: https://arxiv.org/pdf/2602.09022
• Project Page: https://3d-models.hunyuan.tencent.com/world/

==================================

WildReward: Learning Reward Models from In-the-Wild Human Interactions

📝 Summary:
WildReward demonstrates that reward models can be effectively trained from in-the-wild user interactions using ordinal regression, achieving performance comparable to traditional methods while benefit...

🔹 Publication Date: Published on Feb 9

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2602.08829
• PDF: https://arxiv.org/pdf/2602.08829
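
A small sketch of ordinal regression as it might apply to a reward model, using a standard cumulative-link formulation: a scalar reward score is compared against ordered thresholds to give a probability for each ordered feedback level. The thresholds, the three-level labeling, and the function names are hypothetical; the summary does not say which ordinal formulation the paper uses.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def ordinal_probs(score, thresholds):
    """Probabilities of each ordered level given a scalar reward score.

    thresholds must be sorted ascending; P(level <= k) = sigmoid(thresholds[k] - score).
    """
    cumulative = [sigmoid(t - score) for t in thresholds] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)
        previous = c
    return probs


def ordinal_nll(score, thresholds, label):
    """Negative log-likelihood of the observed level (0 = worst)."""
    return -math.log(ordinal_probs(score, thresholds)[label])


# Hypothetical levels: 0 = negative, 1 = neutral, 2 = positive feedback.
thresholds = [-1.0, 1.0]
print(ordinal_probs(0.0, thresholds))
print(ordinal_nll(2.0, thresholds, label=2) < ordinal_nll(2.0, thresholds, label=0))
```

Unlike a plain classification loss, this keeps the ordering of the levels: raising the score shifts probability mass toward higher levels only.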

==================================
