Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training

Created by
  • Haebom

Authors

Tianqing Fang, Zhisong Zhang, Xiaoyang Wang, Rui Wang, Can Qin, Yuxuan Wan, Jun-Yu Ma, Ce Zhang, Jiaqi Chen, Xiyun Li, Hongming Zhang, Haitao Mi, Dong Yu

Outline

Cognitive Kernel-Pro is a fully open-source, free, multi-module agent framework for general AI agents, enabling complex reasoning, web interaction, coding, and autonomous research. The study systematically examines how to curate high-quality training data for agent foundation models, constructing questions, trajectories, and verifiable answers across four key domains: web, files, code, and general reasoning. It also explores novel test-time reflection and voting strategies to improve agent robustness and performance. Evaluated on GAIA, the framework achieves state-of-the-art results among open-source, free agents; in particular, the 8-billion-parameter open-source model outperforms previous leading systems such as WebDancer and WebSailor, setting a new performance standard for accessible, high-performance AI agents.
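
Since the summary singles out test-time reflection and voting as the robustness technique, here is a minimal sketch of how such a strategy could be wired up. The function names (run_agent, reflect), the rollout count, and the majority-vote heuristic are illustrative assumptions for exposition, not the paper's actual implementation.

```python
# Hedged sketch: test-time reflection + voting over independent agent rollouts.
# run_agent and reflect are hypothetical placeholders, not Cognitive Kernel-Pro APIs.
from collections import Counter

def run_agent(question: str, seed: int) -> str:
    """Placeholder: one independent agent rollout that returns a candidate answer."""
    raise NotImplementedError

def reflect(question: str, answer: str) -> bool:
    """Placeholder: a self-check pass that accepts or rejects a candidate answer."""
    raise NotImplementedError

def answer_with_voting(question: str, n_rollouts: int = 5) -> str:
    # 1) Sample several independent rollouts of the agent.
    candidates = [run_agent(question, seed=i) for i in range(n_rollouts)]
    # 2) Reflection: keep only candidates the agent still accepts on a second look;
    #    fall back to all candidates if reflection rejects everything.
    accepted = [a for a in candidates if reflect(question, a)] or candidates
    # 3) Voting: return the most frequent surviving answer.
    return Counter(accepted).most_common(1)[0][0]
```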

Takeaways, Limitations

Takeaways:
Improves research accessibility and reproducibility through an open-source, freely available general AI agent framework.
Presents a high-quality training-data curation strategy and builds datasets for four key domains: web, files, code, and general reasoning.
Improves agent robustness and performance through test-time reflection and voting strategies.
The 8-billion-parameter open-source model outperforms previous best-performing systems.
Contributes to the democratization of research and development of general AI agents.
Limitations:
The paper does not explicitly discuss its limitations. Future work may require further improvements and extensions (e.g., performance limits on specific tasks, limited support for multiple languages).