ML Research Hub

✨Scaling Agents via Continual Pre-training

📝 Summary:
Current agentic LLMs underperform because of optimization tensions during post-training. This paper proposes Agentic Continual Pre-training (Agentic CPT) to build powerful agentic foundation models. The resulting AgentFounder model achieves state-of-the-art performance on benchmarks while retaining strong tool-use ability.

🔹 Publication Date: Published on Sep 16

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2509.13310
• PDF: https://arxiv.org/pdf/2509.13310
• Project Page: https://tongyi-agent.github.io/blog/
• Github: https://tongyi-agent.github.io/blog/

==================================

For more data science resources:
✓ https://t.iss.one/DataScienceT

#LLMAgents #ContinualPretraining #FoundationModels #AIResearch #ToolUse

ML Research Hub

✨In-the-Flow Agentic System Optimization for Effective Planning and Tool Use

📝 Summary:
AgentFlow is a trainable agentic framework that optimizes its planner in-the-flow within multi-turn interactions. It uses Flow-GRPO to train its modules and significantly outperforms top baselines and GPT-4o on various reasoning and tool-use tasks.

🔹 Publication Date: Published on Oct 7

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2510.05592
• PDF: https://arxiv.org/pdf/2510.05592
• Project Page: https://agentflow.stanford.edu/
• Github: https://github.com/lupantech/AgentFlow

✨ Spaces citing this paper:
• https://huggingface.co/spaces/AgentFlow/agentflow
• https://huggingface.co/spaces/bioliveir4/agentflow2
• https://huggingface.co/spaces/bioliveir4/agentflow

#AI #MachineLearning #AIagents #ToolUse #Planning

ML Research Hub

✨LoopTool: Closing the Data-Training Loop for Robust LLM Tool Calls

📝 Summary:
LoopTool is an automated framework that closes the data-training loop for LLMs. It iteratively refines data and models to improve tool-use capabilities, achieving state-of-the-art results and surpassing larger models cost-effectively.

🔹 Publication Date: Published on Nov 12

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.09148
• PDF: https://arxiv.org/pdf/2511.09148
• Github: https://github.com/Rednote-ExperienceAI-Lab/LoopTool

#LLM #AI #MachineLearning #DataScience #ToolUse

ML Research Hub

✨M3-Bench: Multi-Modal, Multi-Hop, Multi-Threaded Tool-Using MLLM Agent Benchmark

📝 Summary:
M3-Bench is a new benchmark evaluating multimodal LLM agent tool use in complex, multi-hop workflows requiring visual grounding and tool dependencies. It introduces a similarity-driven alignment method and interpretable metrics. Evaluations show significant gaps in current MLLMs, especially in ar...

🔹 Publication Date: Published on Nov 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.17729
• PDF: https://arxiv.org/pdf/2511.17729
• Github: https://github.com/EtaYang10th/Open-M3-Bench

#MLLM #LLMAgents #AI #Benchmarking #ToolUse

ML Research Hub

✨Budget-Aware Tool-Use Enables Effective Agent Scaling

📝 Summary:
Tool-augmented agents struggle to scale with more tool calls due to a lack of budget awareness. This paper introduces Budget Tracker for continuous budget awareness and BATS for adaptive planning, dynamically adjusting strategy based on remaining resources. These methods significantly improve cos...

🔹 Publication Date: Published on Nov 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2511.17006
• PDF: https://arxiv.org/pdf/2511.17006
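
💡 Illustration (not from the paper):
To make the idea of budget awareness concrete, here is a minimal sketch of how an agent loop might track a tool-call budget and switch strategy as it runs low. The class name `BudgetTracker` is borrowed from the summary, but every method, the `choose_strategy` helper, the strategy labels, and the 25% threshold are hypothetical choices made for this example. The paper's actual design may differ.

```python
from dataclasses import dataclass


@dataclass
class BudgetTracker:
    """Hypothetical helper: counts tool calls against a fixed limit."""
    max_calls: int
    used: int = 0

    @property
    def remaining(self) -> int:
        return self.max_calls - self.used

    def record_call(self) -> None:
        # Refuse to spend budget that is no longer available.
        if self.remaining <= 0:
            raise RuntimeError("tool-call budget exhausted")
        self.used += 1

    def status_line(self) -> str:
        # Text an agent loop could append to each prompt so the
        # model always sees how much budget is left.
        return (
            f"Tool calls used: {self.used}/{self.max_calls} "
            f"(remaining: {self.remaining})"
        )


def choose_strategy(tracker: BudgetTracker, low_fraction: float = 0.25) -> str:
    """Illustrative rule (an assumption, not the paper's policy):
    explore while budget is ample, verify when it runs low,
    and answer once it is gone."""
    if tracker.remaining == 0:
        return "answer"
    if tracker.remaining <= tracker.max_calls * low_fraction:
        return "verify"
    return "explore"
```

In a real loop, `record_call()` would run before each tool invocation and `status_line()` would be added to the next prompt, so the planner can adapt as resources shrink.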

#AIAgents #ToolUse #ResourceManagement #AgentScaling #AIResearch
