ML Research Hub
Advancing research in Machine Learning – practical insights, tools, and techniques for researchers.

Admin: @HusseinSheikho || @Hussein_Sheikho
Demystifying the Slash Pattern in Attention: The Role of RoPE

📝 Summary:
Slash-Dominant Heads in LLMs emerge when queries and keys are almost rank-one and Rotary Position Embedding has dominant medium-high frequencies. Theoretical proof shows these conditions, combined with gradient descent, explain their emergence.
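A minimal sketch of why RoPE can produce a slash (diagonal) pattern at all: once queries and keys are rotated by a position-dependent angle, their dot product depends only on the offset between the two positions. This is not the paper's proof. The single 2-D frequency `theta`, the vectors `q` and `k`, and the helper names are illustrative assumptions. Using the same `q` and `k` at every position mirrors the near rank-one condition in the summary.

```python
import math

def rotate(vec, pos, theta):
    """Rotate a 2-D vector by the angle pos * theta (one RoPE frequency pair)."""
    x, y = vec
    c, s = math.cos(pos * theta), math.sin(pos * theta)
    return (x * c - y * s, x * s + y * c)

def rope_score(q, k, i, j, theta):
    """Attention logit between a query at position i and a key at position j."""
    qi = rotate(q, i, theta)
    kj = rotate(k, j, theta)
    return qi[0] * kj[0] + qi[1] * kj[1]

# Hypothetical values chosen only for illustration.
q, k, theta = (1.0, 0.5), (0.8, -0.3), 0.7

# Every pair below has the same offset i - j = 2, so the logits coincide:
# this is one "slash" of the attention map.
same_offset = [rope_score(q, k, i, i - 2, theta) for i in range(2, 6)]

# A different offset generally gives a different logit.
other_offset = rope_score(q, k, 5, 4, theta)
```

With content-dependent queries and keys the diagonals are no longer exactly constant; the summary's near rank-one condition is what keeps them close to it.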

🔹 Publication Date: Published on Jan 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.08297
• PDF: https://arxiv.org/pdf/2601.08297

==================================

For more data science resources:
https://t.iss.one/DataScienceT

#AI #DataScience #MachineLearning #HuggingFace #Research
M^4olGen: Multi-Agent, Multi-Stage Molecular Generation under Precise Multi-Property Constraints

📝 Summary:
M^4olGen is a multi-agent, multi-stage framework for precise molecular generation under multiple physicochemical constraints. It uses fragment-level, retrieval-augmented reasoning and RL-based optimization, outperforming LLMs and graph-based methods.

🔹 Publication Date: Published on Jan 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10131
• PDF: https://arxiv.org/pdf/2601.10131

#AI #DataScience #MachineLearning #HuggingFace #Research
Your Group-Relative Advantage Is Biased

📝 Summary:
Group-based Reinforcement Learning from Verifier Rewards uses a biased advantage estimator that underestimates advantages on hard prompts and overestimates them on easy ones. This paper proposes History-Aware Adaptive Difficulty Weighting (HA-DW) to correct this bias, improving performance on reasoning tasks.
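For context, a minimal sketch of the commonly used group-relative advantage that such methods start from: each response's reward is standardized against the other responses sampled for the same prompt. This is the baseline estimator only, not HA-DW. The function name, the `eps` constant, and the reward values are assumptions for illustration.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-8):
    """Standardize each reward against its own group's mean and std."""
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std over the sampled group
    return [(r - mu) / (sigma + eps) for r in rewards]

# One hypothetical hard prompt: four sampled responses, one passes the verifier.
advantages = group_relative_advantages([1.0, 0.0, 0.0, 0.0])

# If every response gets the same reward, the group carries no learning signal.
no_signal = group_relative_advantages([1.0, 1.0])
```

Because the baseline is computed from a handful of samples per prompt, it is only a noisy estimate of the prompt's true difficulty, which is where a difficulty-dependent bias can enter.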

🔹 Publication Date: Published on Jan 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.08521
• PDF: https://arxiv.org/pdf/2601.08521

#ReinforcementLearning #MachineLearning #AIResearch #BiasCorrection #ReasoningTasks
RubricHub: A Comprehensive and Highly Discriminative Rubric Dataset via Automated Coarse-to-Fine Generation

📝 Summary:
This work presents an automated rubric generation framework and RubricHub dataset for open-ended AI generation. RubricHub enables significant performance gains, achieving state-of-the-art results on HealthBench and surpassing GPT-5.

🔹 Publication Date: Published on Jan 13

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.08430
• PDF: https://arxiv.org/pdf/2601.08430
• Project Page: https://huggingface.co/datasets/sojuL/RubricHub_v1
• Github: https://github.com/teqkilla/RubricHub

Datasets citing this paper:
https://huggingface.co/datasets/sojuL/RubricHub_v1

#AI #GenerativeAI #MachineLearning #NLP #Dataset
BAPO: Boundary-Aware Policy Optimization for Reliable Agentic Search

📝 Summary:
BAPO is a reinforcement learning framework for agentic search that improves reliability by teaching agents to recognize the limits of their reasoning and to respond appropriately when evidence is insufficient.

🔹 Publication Date: Published on Jan 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11037
• PDF: https://arxiv.org/pdf/2601.11037
• Github: https://github.com/Liushiyu-0709/BAPO-Reliable-Search

#AI #DataScience #MachineLearning #HuggingFace #Research
ProFit: Leveraging High-Value Signals in SFT via Probability-Guided Token Selection

📝 Summary:
Supervised fine-tuning with multiple references addresses overfitting to non-core expressions by masking low-probability tokens based on their semantic importance.

🔹 Publication Date: Published on Jan 14

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09195
• PDF: https://arxiv.org/pdf/2601.09195
• Github: https://github.com/Utaotao/ProFit

#AI #DataScience #MachineLearning #HuggingFace #Research
Reasoning Models Generate Societies of Thought

📝 Summary:
Reasoning models demonstrate enhanced performance through multi-agent-like interactions that create diverse cognitive perspectives and improve problem-solving through structured social organization.

🔹 Publication Date: Published on Jan 15

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10825
• PDF: https://arxiv.org/pdf/2601.10825

#AI #DataScience #MachineLearning #HuggingFace #Research
AstroReason-Bench: Evaluating Unified Agentic Planning across Heterogeneous Space Planning Problems

📝 Summary:
Recent advances in agentic Large Language Models (LLMs) have positioned them as generalist planners capable of reasoning and acting across diverse tasks. However, existing agent benchmarks largely foc...

🔹 Publication Date: Published on Jan 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11354
• PDF: https://arxiv.org/pdf/2601.11354
• Github: https://github.com/Mtrya/astro-reason

#AI #DataScience #MachineLearning #HuggingFace #Research
Monolith: Real Time Recommendation System With Collisionless Embedding Table

📝 Summary:
Monolith is a real-time recommendation system designed for online training. It features a collisionless embedding table with memory optimizations and a fault-tolerant architecture, enabling real-time learning by overcoming limitations of general DL frameworks.
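A rough sketch of what "collisionless" means for an embedding table, contrasted with the usual fixed-size hashed table. This is a toy illustration, not Monolith's implementation: a plain Python dict stands in for the real hash table, and the class names and feature IDs are hypothetical.

```python
class HashedTable:
    """Fixed-size table: IDs share a row whenever id % num_rows matches."""

    def __init__(self, num_rows):
        self.num_rows = num_rows

    def row_for(self, feature_id):
        return feature_id % self.num_rows


class CollisionlessTable:
    """Grows on demand: each distinct ID is assigned its own row."""

    def __init__(self):
        self.rows = {}

    def row_for(self, feature_id):
        # A new ID gets the next free row; a known ID keeps its row.
        return self.rows.setdefault(feature_id, len(self.rows))


ids = [7, 1007, 2007]  # hypothetical sparse feature IDs
hashed = HashedTable(num_rows=1000)
collisionless = CollisionlessTable()

hashed_rows = [hashed.row_for(i) for i in ids]          # all three collide
unique_rows = [collisionless.row_for(i) for i in ids]   # one row per ID
```

The trade-off is memory: a table that never shares rows grows with the number of distinct IDs, which is why the summary pairs it with memory optimizations.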

🔹 Publication Date: Published on Sep 16, 2022

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2209.07663
• PDF: https://arxiv.org/pdf/2209.07663
• Github: https://github.com/bytedance/monolith

#RecommendationSystems #DeepLearning #MachineLearning #RealTimeAI #DataScience
Agent Lightning: Train ANY AI Agents with Reinforcement Learning

📝 Summary:
Agent Lightning is a flexible RL framework for training LLMs in any AI agent, uniquely decoupling execution from training. It uses a hierarchical RL algorithm to handle complex interactions, enabling seamless integration with existing agents and showing stable improvements.

🔹 Publication Date: Published on Aug 5, 2025

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.03680
• PDF: https://arxiv.org/pdf/2508.03680
• Project Page: https://www.microsoft.com/en-us/research/project/agent-lightning/
• Github: https://github.com/microsoft/agent-lightning

#AI #ReinforcementLearning #LLMs #AIAgents #MachineLearning