✨MatchTIR: Fine-Grained Supervision for Tool-Integrated Reasoning via Bipartite Matching
📝 Summary:
MatchTIR enhances LLM reasoning by introducing fine-grained credit assignment through bipartite matching and dual-level advantage estimation for tool-integrated tasks.
🔹 Publication Date: Published on Jan 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10712
• PDF: https://arxiv.org/pdf/2601.10712
• Project Page: https://huggingface.co/collections/ChangleQu/matchtir
• GitHub: https://github.com/quchangle1/MatchTIR
🔹 Models citing this paper:
• https://huggingface.co/ChangleQu/Qwen3-8B-MatchTIR-KM
• https://huggingface.co/ChangleQu/Qwen3-8B-MatchTIR-OT
==================================
For more data science resources:
✓ https://t.iss.one/DataScienceT
#AI #DataScience #MachineLearning #HuggingFace #Research
✨FlowAct-R1: Towards Interactive Humanoid Video Generation
📝 Summary:
FlowAct-R1 enables real-time interactive humanoid video generation with high-fidelity synthesis and low-latency responsiveness through MMDiT architecture and chunkwise diffusion forcing strategies.
🔹 Publication Date: Published on Jan 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10103
• PDF: https://arxiv.org/pdf/2601.10103
==================================
✨Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs
📝 Summary:
Reinforcement learning for large language models is enhanced by a rollout-level objective that rewards rare high-level reasoning strategies, improving diverse solution discovery without sacrificing in...
🔹 Publication Date: Published on Jan 13
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.08763
• PDF: https://arxiv.org/pdf/2601.08763
==================================
✨Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning
📝 Summary:
Multi-Agent Test-Time Reinforcement Learning (MATTRL) enhances multi-agent reasoning through structured textual experience injection and consensus-based decision making at inference time.
🔹 Publication Date: Published on Jan 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09667
• PDF: https://arxiv.org/pdf/2601.09667
==================================
✨EvasionBench: Detecting Evasive Answers in Financial Q&A via Multi-Model Consensus and LLM-as-Judge
📝 Summary:
EvasionBench introduces a large-scale benchmark for detecting evasive responses in earnings calls using a multi-model annotation framework that leverages disagreement between advanced language models ...
🔹 Publication Date: Published on Jan 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09142
• PDF: https://arxiv.org/pdf/2601.09142
🔹 Models citing this paper:
• https://huggingface.co/FutureMa/Eva-4B
🔹 Spaces citing this paper:
• https://huggingface.co/spaces/FutureMa/financial-evasion-detection
==================================
✨ToolSafe: Enhancing Tool Invocation Safety of LLM-based agents via Proactive Step-level Guardrail and Feedback
📝 Summary:
A guardrail model and reasoning framework are developed to detect and prevent unsafe tool invocations in LLM agents, improving both safety and task performance under adversarial conditions.
🔹 Publication Date: Published on Jan 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10156
• PDF: https://arxiv.org/pdf/2601.10156
• GitHub: https://github.com/MurrayTom/ToolSafe
==================================
✨Transition Matching Distillation for Fast Video Generation
📝 Summary:
Transition Matching Distillation enables efficient video generation by distilling diffusion models into few-step predictors using conditional flows and semantic representation decomposition.
🔹 Publication Date: Published on Jan 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09881
• PDF: https://arxiv.org/pdf/2601.09881
==================================
✨Action100M: A Large-scale Video Action Dataset
📝 Summary:
Action100M is a large-scale video action dataset constructed from internet instructional videos using automated pipelines with V-JEPA embeddings and GPT-based reasoning for structured annotations.
🔹 Publication Date: Published on Jan 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10592
• PDF: https://arxiv.org/pdf/2601.10592
==================================
✨STEP3-VL-10B Technical Report
📝 Summary:
STEP3-VL-10B is a lightweight 10B multimodal model that rivals much larger models and proprietary flagships in performance. It uses unified pre-training, scaled post-training, and Parallel Coordinated Reasoning for efficient visual reasoning. This open-source model sets a new standard for compact...
🔹 Publication Date: Published on Jan 14
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.09668
• PDF: https://arxiv.org/pdf/2601.09668
• Project Page: https://stepfun-ai.github.io/Step3-VL-10B
• GitHub: https://github.com/stepfun-ai/Step3-VL-10B
🔹 Models citing this paper:
• https://huggingface.co/stepfun-ai/Step3-VL-10B
• https://huggingface.co/stepfun-ai/Step3-VL-10B-Base
==================================
✨PRL: Process Reward Learning Improves LLMs' Reasoning Ability and Broadens the Reasoning Boundary
📝 Summary:
Process Reward Learning decomposes reinforcement learning objectives into intermediate steps to provide fine-grained supervision for improving large language model reasoning abilities.
🔹 Publication Date: Published on Jan 15
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2601.10201
• PDF: https://arxiv.org/pdf/2601.10201
• GitHub: https://github.com/MaxwellJryao/Process-Reward-Learning
==================================