Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

SE-Agent: Self-Evolution Trajectory Optimization in Multi-Step Reasoning with LLM-Based Agents

Created by
  • Haebom

Authors

Jiaye Lin, Yifu Guo, Yuzhen Han, Sen Hu, Ziyi Ni, Licheng Wang, Mingguang Chen, Daxin Jiang, Binxing Jiao, Chen Hu, Huacan Wang

Outline

This paper proposes SE-Agent, a framework for optimizing the problem-solving process (interaction paths) of agents based on large language models (LLMs). The authors argue that existing methods such as MCTS are inefficient because they ignore interdependencies among paths and explore a search space that lacks diversity. SE-Agent iteratively optimizes the problem-solving process in a self-evolutionary manner through three operations: revision, recombination, and refinement of existing paths. This lets it explore diverse solution paths while reducing the impact of inefficient ones, which improves performance. Experiments on SWE-bench Verified show state-of-the-art results, with performance gains of up to 55% across five strong LLMs.
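The iterative loop described above can be illustrated with a toy sketch. This is not the authors' implementation: a path is modeled here as a plain list of integers, the score is a made-up distance to a fixed target, and the names `revise`, `recombine`, `refine`, `evolve`, `toy_score`, and `TARGET` are all hypothetical stand-ins for the three operations on existing paths. In the real setting each operation would presumably involve LLM calls on agent interaction logs.

```python
import random
from typing import Callable, List

Trajectory = List[int]  # toy stand-in for a sequence of agent actions
TARGET = [3, 1, 4, 1, 5]  # hypothetical "ideal" path, used only for scoring


def toy_score(traj: Trajectory) -> int:
    """Higher is better: negative L1 distance to the toy target."""
    return -sum(abs(a - b) for a, b in zip(traj, TARGET))


def revise(traj: Trajectory, rng: random.Random) -> Trajectory:
    """Modify one step of an existing path."""
    out = list(traj)
    out[rng.randrange(len(out))] = rng.randint(0, 9)
    return out


def recombine(a: Trajectory, b: Trajectory, rng: random.Random) -> Trajectory:
    """Splice a prefix of one path onto the suffix of another."""
    cut = rng.randint(1, min(len(a), len(b)) - 1)
    return a[:cut] + b[cut:]


def refine(traj: Trajectory, score: Callable[[Trajectory], int]) -> Trajectory:
    """Local improvement: keep a one-step nudge only if it raises the score."""
    best = list(traj)
    for i in range(len(best)):
        for delta in (-1, 1):
            cand = list(best)
            cand[i] += delta
            if score(cand) > score(best):
                best = cand
    return best


def evolve(population: List[Trajectory],
           score: Callable[[Trajectory], int],
           iterations: int = 5,
           seed: int = 0) -> List[Trajectory]:
    """Repeatedly revise, recombine, and refine; keep the best paths each round.

    Assumes at least two paths, each with at least two steps.
    """
    rng = random.Random(seed)
    pool = [list(t) for t in population]
    for _ in range(iterations):
        candidates = list(pool)                       # existing paths are reused
        candidates += [revise(t, rng) for t in pool]  # modified variants
        for _ in range(len(pool)):                    # cross-path variants
            a, b = rng.sample(pool, 2)
            candidates.append(recombine(a, b, rng))
        candidates = [refine(t, score) for t in candidates]
        candidates.sort(key=score, reverse=True)
        pool = candidates[:len(population)]
    return pool


if __name__ == "__main__":
    start = [[0, 0, 0, 0, 0], [9, 9, 9, 9, 9], [5, 5, 5, 5, 5]]
    best = evolve(start, toy_score)[0]
    print(best, toy_score(best))
```

Because the original paths stay in the candidate pool and refinement never lowers a score, the best score in this sketch cannot get worse from one round to the next. The repeated scoring inside each round also hints at why a self-evolution loop adds computational cost.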

Takeaways, Limitations

Takeaways:
A novel approach to optimizing the problem-solving process of LLM-based agents.
A self-evolutionary framework that expands the search space and improves performance.
Efficiency gained by reusing existing paths.
Practicality validated on real GitHub issues, with performance gains of up to 55%.
Improved accessibility through an open-source release.
Limitations:
Further research is needed to determine how well the proposed framework generalizes.
Because the evaluation covers a single domain (GitHub issues), its scalability to other domains still needs to be verified.
Potentially higher computational cost, since the self-evolution process runs repeatedly.
The evaluation depends on the SWE-bench Verified dataset.