Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Atom-Searcher: Enhancing Agentic Deep Research via Fine-Grained Atomic Thought Reward

Created by
  • Haebom

Authors

Yong Deng, Guoqing Wang, Zhenzhe Ying, Xiaofeng Wu, Jinzhen Lin, Wenwen Xiong, Yuqin Dai, Shuo Yang, Zhanwei Zhang, Qiwen Wang, Yang Qin, Changhua Meng

Outline

This paper proposes Atom-Searcher, a novel reinforcement learning (RL) framework for enhancing the complex problem-solving capabilities of large language models (LLMs). To overcome the limitations of existing retrieval-augmented generation (RAG) approaches, the paper focuses on agentic deep research, in which LLMs autonomously reason, search, and synthesize information. To address the challenges of outcome-based RL, such as conflicting gradients and reward sparsity, the authors present Atomic Thought, a thinking paradigm that decomposes the reasoning process into fine-grained functional units. These units are supervised by Reasoning Reward Models (RRMs), which provide Atomic Thought Rewards (ATRs) as fine-grained guidance for the reasoning process. A curriculum-based reward schedule prioritizes process-level ATRs early in training and gradually transitions to outcome-level rewards, accelerating convergence to effective reasoning paths. Experiments on seven benchmarks show that the approach outperforms existing state-of-the-art methods. The reported advantages are that Atom-Searcher scales computation at test time, that Atomic Thought provides supervision anchors for RRMs, and that the resulting reasoning patterns are more interpretable and human-like.
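The curriculum-based reward schedule described above can be illustrated with a minimal sketch. The summary does not give the exact form of the schedule, so the linear decay, the starting weight of 0.5, and the names `atr_weight` and `blended_reward` are all assumptions made for illustration, not the authors' implementation.

```python
def atr_weight(step: int, total_steps: int, initial_weight: float = 0.5) -> float:
    """Weight placed on the process-level ATR at a given training step.

    Assumption: the weight decays linearly from `initial_weight` to 0.
    The source only says the schedule moves from process-level to
    outcome-level rewards; the linear shape is hypothetical.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp progress to [0, 1] so out-of-range steps stay well-defined.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return initial_weight * (1.0 - progress)


def blended_reward(
    atr: float,
    outcome: float,
    step: int,
    total_steps: int,
    initial_weight: float = 0.5,
) -> float:
    """Convex combination of the process-level and outcome-level rewards."""
    w = atr_weight(step, total_steps, initial_weight)
    return w * atr + (1.0 - w) * outcome
```

Under these assumptions, the process-level and outcome-level rewards contribute equally at step 0, and only the outcome-level reward remains at the final step. A convex combination keeps the blended reward within the range of its two inputs, which is one simple way to avoid rescaling the reward as training proceeds.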

Takeaways, Limitations

Takeaways:
Presents a new approach to improving the complex problem-solving abilities of LLMs.
Addresses the limitations of existing RAG methods in multi-step reasoning and strategic search.
Overcomes the limitations of outcome-based RL and improves training efficiency.
Scales computation at test time.
Produces more interpretable and human-like reasoning patterns.
Provides effective supervision anchors for RRMs through Atomic Thought.
Limitations:
Further verification of the generalization performance of the proposed Atom-Searcher is needed.
Applicability and performance need to be evaluated on a wider variety of problem types.
Further research is needed on RRM design and ATR definition.
Large datasets and substantial computational resources are required.