Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning

Created by
  • Haebom

Authors

Zeyi Sun, Yuhang Cao, Jianze Liang, Qiushi Sun, Ziyu Liu, Zhixiong Zhang, Yuhang Zang, Xiaoyi Dong, Kai Chen, Dahua Lin, Jiaqi Wang

Outline

This paper addresses the design of autonomous agents for graphical user interfaces (GUIs) in specialized domains such as scientific computing, where tasks require both long-horizon planning and precise execution. Existing approaches face a tradeoff: generalist agents plan well but execute poorly, while specialist agents show the opposite weakness. To overcome this, the authors present CODA, a learnable, compositional framework that integrates a generalist planner (Cerebrum) with a specialist executor (Cerebellum). CODA is trained through a two-stage pipeline. In the first stage, Specialization, an expert planner is trained individually for each scientific application using decoupled reinforcement learning. In the second stage, Generalization, the successful trajectories from all specialized experts are aggregated and used for supervised fine-tuning of the final planner. This gives CODA both robust execution and cross-domain generalization. On four applications from the ScienceBoard benchmark, CODA significantly outperforms baseline methods and achieves the highest performance among open-source models.
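The data flow of the two-stage pipeline can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name here (`Trajectory`, `run_episode`, `specialize`, `build_sft_dataset`, `make_expert`) is hypothetical, and the actual training steps (reinforcement learning of each expert planner and supervised fine-tuning of the final planner) are replaced by placeholders. It only shows how a planner and an executor split the work, and how successful trajectories from per-application experts might be pooled.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical types for illustration; none of these names come from the paper.
Planner = Callable[[str, int], str]  # (task, step index) -> high-level instruction
Executor = Callable[[str], str]      # instruction -> concrete GUI action


@dataclass(frozen=True)
class Trajectory:
    app: str
    task: str
    steps: Tuple[Tuple[str, str], ...]  # (instruction, action) pairs
    success: bool


def run_episode(app: str, task: str, planner: Planner, executor: Executor,
                is_success: Callable[[list], bool], max_steps: int = 3) -> Trajectory:
    """Roll out one task: the planner decides what to do, the executor decides how."""
    steps = []
    for i in range(max_steps):
        instruction = planner(task, i)   # "Cerebrum" role: planning
        action = executor(instruction)   # "Cerebellum" role: precise execution
        steps.append((instruction, action))
    return Trajectory(app, task, tuple(steps), is_success(steps))


def specialize(tasks_by_app: Dict[str, List[str]],
               make_expert: Callable[[str], Planner],
               executor: Executor,
               is_success: Callable[[list], bool]) -> Dict[str, List[Trajectory]]:
    """Stage 1 (Specialization): one expert planner per application.

    make_expert is a placeholder for per-application planner training,
    which this sketch does not model.
    """
    rollouts = {}
    for app, tasks in tasks_by_app.items():
        expert = make_expert(app)
        rollouts[app] = [run_episode(app, t, expert, executor, is_success)
                         for t in tasks]
    return rollouts


def build_sft_dataset(rollouts: Dict[str, List[Trajectory]]) -> List[Trajectory]:
    """Stage 2 (Generalization): pool only the successful trajectories of all experts."""
    return [t for trajs in rollouts.values() for t in trajs if t.success]
```

In this sketch, the list returned by `build_sft_dataset` would serve as the supervised fine-tuning data for the final planner; failed trajectories are dropped so that the final planner learns only from successful behavior.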

Takeaways, Limitations

Takeaways:
Presents a novel approach to improving the performance of autonomous GUI agents in scientific computing.
Overcomes the planning-execution tradeoff by combining a generalist planner with a specialist executor.
Adapts from experience through a learnable, compositional framework.
Achieves effective performance even in limited-data settings.
Achieves the highest performance among open-source models.
Limitations:
Further evaluation of the framework's generalizability is needed.
Scalability to a wider range of scientific fields and more complex GUI environments remains to be verified.
Evaluation on benchmarks other than ScienceBoard is needed.
Dependence on the quality of the training data needs to be assessed.