✨Demystifying the Slash Pattern in Attention: The Role of RoPE
📝 Summary:
Slash-Dominant Heads in LLMs emerge when queries and keys are almost rank-one and the medium-to-high frequency components of Rotary Position Embedding (RoPE) dominate. The paper proves that these conditions, combined with gradient descent, explain their emergence.
🔹 Publication Date: Published on Jan 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.08297
• PDF: https://arxiv.org/pdf/2601.08297
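To make the summary concrete, here is a minimal sketch of why RoPE can produce diagonal ("slash") structure. It covers only the exactly rank-one special case where every position shares one query vector and one key vector, and it is an illustration, not the paper's proof. The frequencies and vectors are hypothetical values chosen for the example. Because RoPE rotates each coordinate pair by an angle proportional to position, the pre-softmax score between positions i and j then depends only on the offset i - j, so every diagonal of the score matrix is constant.

```python
import math

def rope(vec, pos, freqs):
    """Rotate each consecutive pair of `vec` by the angle pos * freq (standard RoPE)."""
    out = []
    for m, theta in enumerate(freqs):
        x, y = vec[2 * m], vec[2 * m + 1]
        c, s = math.cos(pos * theta), math.sin(pos * theta)
        out.extend([x * c - y * s, x * s + y * c])
    return out

def causal_scores(q, k, n, freqs):
    """Pre-softmax attention scores when all n positions share the same q and k."""
    table = []
    for i in range(n):
        qi = rope(q, i, freqs)
        row = []
        for j in range(i + 1):  # causal: keys at or before the query position
            kj = rope(k, j, freqs)
            row.append(sum(a * b for a, b in zip(qi, kj)))
        table.append(row)
    return table

def diagonal(scores, offset):
    """Scores at a fixed query-key offset, i.e. one 'slash' of the matrix."""
    return [scores[i][i - offset] for i in range(offset, len(scores))]

# Hypothetical values for illustration only.
freqs = [1.0, 0.3]
q = [1.0, 0.5, -0.2, 0.8]
k = [0.7, -0.1, 0.4, 0.9]
scores = causal_scores(q, k, 6, freqs)

for offset in range(3):
    print(offset, [round(v, 6) for v in diagonal(scores, offset)])
```

Each printed row repeats a single value, which is the constant-along-diagonals structure. Which offsets end up dominant after softmax depends on the frequencies and on training, which is the part the paper analyzes.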
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨M^4olGen: Multi-Agent, Multi-Stage Molecular Generation under Precise Multi-Property Constraints
📝 Summary:
M^4olGen is a multi-agent, multi-stage framework for precise molecular generation under multiple physicochemical constraints. It uses fragment-level, retrieval-augmented reasoning and RL-based optimization, and it outperforms LLMs and graph-based methods.
🔹 Publication Date: Published on Jan 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10131
• PDF: https://arxiv.org/pdf/2601.10131
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Your Group-Relative Advantage Is Biased
📝 Summary:
Group-based Reinforcement Learning from Verifier Rewards relies on a biased advantage estimator that underestimates advantages for hard prompts and overestimates them for easy ones. This paper proposes History-Aware Adaptive Difficulty Weighting (HA-DW) to correct this bias, improving performance on reasoning tasks.
🔹 Publication Date: Published on Jan 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.08521
• PDF: https://arxiv.org/pdf/2601.08521
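For readers unfamiliar with group-relative advantages, the sketch below shows the commonly used GRPO-style form: sample several responses to one prompt, score each with a verifier, and normalize each reward against the group's mean and standard deviation. This is an assumption about the kind of estimator the summary refers to. The paper's exact formulation and the HA-DW correction are not given in the summary and are not reproduced here. The function name and `eps` value are hypothetical.

```python
import statistics

def group_relative_advantages(rewards, eps=1e-6):
    """Normalize each reward against its own sampling group (GRPO-style sketch).

    rewards: verifier scores for several responses to the same prompt.
    eps: small constant (hypothetical value) that avoids division by zero
         when every response in the group receives the same reward.
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# A hard prompt: only one of four sampled responses is verified correct.
hard = group_relative_advantages([1.0, 0.0, 0.0, 0.0])
# A group where every response gets the same reward carries no signal.
flat = group_relative_advantages([1.0, 1.0, 1.0, 1.0])
print(hard)
print(flat)
```

The group mean serves as the baseline, so the estimate depends on how many responses are sampled and how often they succeed. That dependence on prompt difficulty is where the summary locates the bias.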
==================================
#ReinforcementLearning #MachineLearning #AIResearch #BiasCorrection #ReasoningTasks
✨RubricHub: A Comprehensive and Highly Discriminative Rubric Dataset via Automated Coarse-to-Fine Generation
📝 Summary:
This work presents an automated rubric generation framework and RubricHub dataset for open-ended AI generation. RubricHub enables significant performance gains, achieving state-of-the-art results on HealthBench and surpassing GPT-5.
🔹 Publication Date: Published on Jan 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.08430
• PDF: https://arxiv.org/pdf/2601.08430
• Project Page: https://huggingface.co/datasets/sojuL/RubricHub_v1
• Github: https://github.com/teqkilla/RubricHub
✨ Datasets citing this paper:
• https://huggingface.co/datasets/sojuL/RubricHub_v1
==================================
#AI #GenerativeAI #MachineLearning #NLP #Dataset
✨BAPO: Boundary-Aware Policy Optimization for Reliable Agentic Search
📝 Summary:
BAPO is a reinforcement learning framework for agentic search that improves reliability by teaching agents to recognize the limits of their reasoning and respond appropriately when evidence is insufficient.
🔹 Publication Date: Published on Jan 16
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11037
• PDF: https://arxiv.org/pdf/2601.11037
• Github: https://github.com/Liushiyu-0709/BAPO-Reliable-Search
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨ProFit: Leveraging High-Value Signals in SFT via Probability-Guided Token Selection
📝 Summary:
Supervised fine-tuning with multiple references addresses overfitting to non-core expressions by masking low-probability tokens based on their semantic importance.
🔹 Publication Date: Published on Jan 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09195
• PDF: https://arxiv.org/pdf/2601.09195
• Github: https://github.com/Utaotao/ProFit
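As a rough illustration of the idea of masking low-probability tokens in an SFT loss, the sketch below averages the negative log-likelihood over only those reference tokens whose model probability clears a cutoff. The summary does not say how ProFit chooses the cutoff or measures semantic importance, so the fixed `threshold` and the function name are hypothetical stand-ins, not the paper's method.

```python
import math

def masked_sft_loss(token_probs, threshold=0.1):
    """Mean negative log-likelihood over tokens that pass a probability cutoff.

    token_probs: the model's probability for each reference token.
    threshold: hypothetical cutoff; tokens below it are masked out of the loss.
    """
    kept = [p for p in token_probs if p >= threshold]
    if not kept:
        return 0.0  # every token was masked, so there is nothing to train on
    return -sum(math.log(p) for p in kept) / len(kept)

# The middle token (probability 0.05) is masked and contributes no gradient signal.
loss = masked_sft_loss([0.9, 0.05, 0.5])
print(loss)
```

In a real training loop the same mask would be applied to per-token losses inside the framework's autograd graph. The plain-Python version only shows which tokens are counted.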
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Reasoning Models Generate Societies of Thought
📝 Summary:
Reasoning models demonstrate enhanced performance through multi-agent-like interactions that create diverse cognitive perspectives and improve problem-solving through structured social organization.
🔹 Publication Date: Published on Jan 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10825
• PDF: https://arxiv.org/pdf/2601.10825
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨AstroReason-Bench: Evaluating Unified Agentic Planning across Heterogeneous Space Planning Problems
📝 Summary:
Recent advances in agentic Large Language Models (LLMs) have positioned them as generalist planners capable of reasoning and acting across diverse tasks. However, existing agent benchmarks largely foc...
🔹 Publication Date: Published on Jan 16
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.11354
• PDF: https://arxiv.org/pdf/2601.11354
• Github: https://github.com/Mtrya/astro-reason
==================================
#AI #DataScience #MachineLearning #HuggingFace #Research
✨Monolith: Real Time Recommendation System With Collisionless Embedding Table
📝 Summary:
Monolith is a real-time recommendation system designed for online training. It features a collisionless embedding table with memory optimizations and a fault-tolerant architecture, enabling real-time learning by overcoming limitations of general DL frameworks.
🔹 Publication Date: Published on Sep 16, 2022
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2209.07663
• PDF: https://arxiv.org/pdf/2209.07663
• Github: https://github.com/bytedance/monolith
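To show what "collisionless" means for an embedding table, the sketch below contrasts a fixed-size table indexed by `id % num_slots`, where distinct feature IDs can share one row, with a table keyed by the ID itself, where each ID gets its own row on first use. A Python `dict` stands in for the hash map here. The class names, sizes, and initialization are hypothetical, and this does not reproduce Monolith's actual data structure or its memory optimizations.

```python
import random

class HashedTable:
    """Fixed number of rows; different IDs may collide on the same row."""
    def __init__(self, num_slots, dim, seed=0):
        rng = random.Random(seed)
        self.num_slots = num_slots
        self.rows = [[rng.gauss(0.0, 0.01) for _ in range(dim)]
                     for _ in range(num_slots)]

    def lookup(self, feature_id):
        return self.rows[feature_id % self.num_slots]

class CollisionlessTable:
    """One row per distinct ID, allocated lazily the first time the ID is seen."""
    def __init__(self, dim, seed=0):
        self.dim = dim
        self.rng = random.Random(seed)
        self.rows = {}

    def lookup(self, feature_id):
        if feature_id not in self.rows:
            self.rows[feature_id] = [self.rng.gauss(0.0, 0.01)
                                     for _ in range(self.dim)]
        return self.rows[feature_id]

hashed = HashedTable(num_slots=8, dim=4)
exact = CollisionlessTable(dim=4)

# IDs 3 and 11 collide under modulo 8, so updates to one would overwrite the other.
print(hashed.lookup(3) is hashed.lookup(11))
print(exact.lookup(3) is exact.lookup(11))
```

The trade-off is memory: a keyed table grows with the number of distinct IDs, which is presumably why the summary pairs the collisionless table with memory optimizations.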
==================================
#RecommendationSystems #DeepLearning #MachineLearning #RealTimeAI #DataScience
✨Agent Lightning: Train ANY AI Agents with Reinforcement Learning
📝 Summary:
Agent Lightning is a flexible RL framework for training LLMs in any AI agent, uniquely decoupling execution from training. It uses a hierarchical RL algorithm to handle complex interactions, enabling seamless integration with existing agents and showing stable improvements.
🔹 Publication Date: Published on Aug 5, 2025
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.03680
• PDF: https://arxiv.org/pdf/2508.03680
• Project Page: https://www.microsoft.com/en-us/research/project/agent-lightning/
• Github: https://github.com/microsoft/agent-lightning
==================================
#AI #ReinforcementLearning #LLMs #AIAgents #MachineLearning