This page collects papers on artificial intelligence published around the world. Summaries are generated with Google Gemini, and the site is operated on a non-profit basis. Copyright for each paper belongs to its authors and their institutions; when sharing, simply cite the source.
This paper proposes the Adaptive Agent Foundation Model (A$^2$FM), which integrates the complementary strengths of reasoning-oriented and agentic LLMs to achieve both accuracy and efficiency, avoiding over-thinking and unnecessary tool calls. A$^2$FM learns task-aware routing and, following a route-then-align principle, aligns mode-specific trajectories; it also introduces an instant mode that answers simple queries directly, removing the inefficiency of invoking reasoning or tools where none are needed. Through Adaptive Policy Optimization (APO), which applies adaptive sampling across modes and a cost-regulated reward, the model improves both accuracy and efficiency. At the 32B scale, A$^2$FM achieves state-of-the-art performance on the BrowseComp, AIME25, and HLE benchmarks while significantly reducing cost.
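The mechanism described above can be sketched in toy form. This is not the paper's implementation: the routing heuristics, the cost model, and the `cost_weight` penalty are hypothetical placeholders; only the three mode names and the idea of an accuracy reward penalized by compute cost come from the summary.

```python
# Illustrative sketch (assumptions, not the paper's method): a task-aware
# router over A^2FM's three execution modes, plus a cost-regulated reward
# in the spirit of Adaptive Policy Optimization (APO).
from dataclasses import dataclass

MODES = ("instant", "reasoning", "agentic")

@dataclass
class RoutedQuery:
    text: str
    mode: str

def route(query: str) -> RoutedQuery:
    """Toy task-aware routing: short lookup-style queries go to instant
    mode, tool-requiring ones to agentic mode, the rest to reasoning."""
    q = query.lower()
    if len(q.split()) <= 6 and "?" in q:
        return RoutedQuery(query, "instant")   # simple query, answer directly
    if any(k in q for k in ("search", "browse", "fetch")):
        return RoutedQuery(query, "agentic")   # needs external tools
    return RoutedQuery(query, "reasoning")     # multi-step reasoning

def cost_regulated_reward(correct: bool, tokens: int, tool_calls: int,
                          cost_weight: float = 1e-4) -> float:
    """Accuracy reward minus a penalty proportional to compute spent,
    discouraging over-thinking and unnecessary tool calls.
    The linear cost model and weights are hypothetical."""
    cost = tokens + 50 * tool_calls
    return (1.0 if correct else 0.0) - cost_weight * cost
```

Under this kind of reward, a trajectory that answers correctly with fewer tokens and tool calls scores strictly higher, which is the incentive the summary attributes to APO's cost regulation.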
Takeaways, Limitations
• Takeaways:
◦ Achieves SOTA across a range of tasks by integrating reasoning and agentic capabilities.
◦ Improves efficiency by introducing an instant mode for simple queries.
◦ Improves accuracy and cost-effectiveness simultaneously through Adaptive Policy Optimization (APO).
◦ Cost-effectiveness: 33.5% to 45.2% savings compared to existing models.
• Limitations:
◦ Results are reported only at the 32B scale; verification on larger models is needed.
◦ Further research is needed on the generalizability of the proposed methodology.
◦ Specific implementation details and the training process of A$^2$FM are under-described.