Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
The copyright of the paper belongs to the author and the relevant institution. When sharing, simply cite the source.

A$^2$FM: An Adaptive Agent Foundation Model for Tool-Aware Hybrid Reasoning

Created by
  • Haebom

Authors

Qianben Chen, Jingyi Cao, Jiayu Zhang, Tianrui Qin, Xiaowan Li, King Zhu, Dingfeng Shi, He Zhu, Minghao Liu, Xiaobo Liang, Xin Gui, Ge Zhang, Jian Yang, Yuchen Eleanor Jiang, Wangchunshu Zhou

Outline

This paper proposes the Adaptive Agent Foundation Model (A$^2$FM), which unifies the complementary strengths of reasoning-centric LLMs and agentic LLMs to achieve both accuracy and efficiency, avoiding excessive thinking and unnecessary tool calls. A$^2$FM follows a route-then-align principle: it first learns task-aware routing and then aligns mode-specific trajectories. To address inefficiency, it adds an instant mode that answers simple queries directly. The paper also proposes Adaptive Policy Optimization (APO), which applies adaptive sampling across modes and a cost-regularized reward to improve accuracy and efficiency together. At the 32B scale, A$^2$FM achieves state-of-the-art performance on the BrowseComp, AIME25, and HLE benchmarks while significantly improving cost efficiency.
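
The routing and cost-aware reward ideas described above can be illustrated with a small toy sketch. This is not the paper's implementation: the router scores, the `Rollout` type, the linear penalty, and the `lam` weight are all hypothetical choices made only to show how a mode might be selected and how a reward could trade correctness against cost.

```python
from dataclasses import dataclass

# Mode names follow the summary above; everything else here is hypothetical.
MODES = ("instant", "reasoning", "agentic")


@dataclass
class Rollout:
    mode: str      # which mode produced the answer
    correct: bool  # whether the final answer was judged correct
    cost: float    # assumed cost in arbitrary units (e.g. tokens or tool calls)


def route(scores: dict[str, float]) -> str:
    """Pick the mode with the highest (hypothetical) router score."""
    return max(MODES, key=lambda mode: scores[mode])


def cost_regularized_reward(rollout: Rollout, lam: float = 0.1) -> float:
    """Toy reward: wrong answers earn 0; correct ones earn 1 minus a cost penalty."""
    if not rollout.correct:
        return 0.0
    # Assumed linear penalty, clipped at 0 so the reward never goes negative.
    return max(0.0, 1.0 - lam * rollout.cost)
```

Under these assumptions, a correct answer from the instant mode at low cost scores higher than an equally correct but more expensive agentic rollout, which is the kind of pressure toward cheaper modes that the summary attributes to APO.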

Takeaways and Limitations

Takeaways:
Achieves SOTA results across a range of tasks by integrating reasoning and agentic capabilities.
Improves efficiency by introducing an instant mode for simple queries.
Improves accuracy and cost-effectiveness together through Adaptive Policy Optimization (APO).
Reduces cost by 33.5% to 45.2% compared to existing models.
Limitations:
Results are reported only at the 32B scale, so performance on larger models still needs to be verified.
Further research is needed to determine how well the proposed methodology generalizes.
Specific implementation details and the training process of A$^2$FM are not fully described.