This page organizes papers related to artificial intelligence published around the world. This page is summarized using Google Gemini and is operated on a non-profit basis. The copyright of the paper belongs to the author and the relevant institution. When sharing, simply cite the source.
This paper highlights the limitations of LLM agents in information seeking and introduces InfoMosaic-Bench, a new benchmark that evaluates their ability to integrate specialized, domain-specific tools with general-purpose search. Its tasks span multiple domains and require combining both kinds of tool, and experiments show that current LLM agents struggle with this integration.
Takeaways, Limitations
• Takeaways:
  ◦ Web information alone is not enough; leveraging domain-specific tools is essential.
  ◦ Domain tools help in some cases, but their benefits are inconsistent.
  ◦ LLM agents struggle with selecting and using tools.
• Limitations:
  ◦ Current LLM agents lack tool-utilization skills.
  ◦ Agents have difficulty integrating tools and navigating complex information-seeking tasks.
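
To make the tool-selection point concrete, here is a minimal sketch of how a single agent run on a task that needs both general search and a domain tool might be graded. It is an illustration only, not the benchmark's actual schema or grader: the `Task` and `Trajectory` classes, the `grade` function, the failure labels, and the tool names `web_search` and `map_lookup` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical task record: a question plus the tools needed to solve it."""
    question: str
    required_tools: frozenset  # names of tools whose output the answer depends on
    gold_answer: str


@dataclass(frozen=True)
class Trajectory:
    """Hypothetical record of one agent run."""
    tool_calls: tuple  # names of the tools the agent invoked, in order
    final_answer: str


def grade(task: Task, traj: Trajectory) -> dict:
    """Label a run as correct, a tool-selection failure, or another failure."""
    missing = task.required_tools - set(traj.tool_calls)
    correct = traj.final_answer.strip().lower() == task.gold_answer.strip().lower()
    if correct:
        label = "correct"
    elif missing:
        # The agent answered wrongly and never called a tool the task needed.
        label = "tool_selection_failure"
    else:
        # All needed tools were called, so the error lies elsewhere
        # (for example, wrong arguments or faulty reasoning over the results).
        label = "other_failure"
    return {"correct": correct, "missing_tools": sorted(missing), "label": label}


# Assumed example: the task needs both general search and a map tool.
task = Task(
    question="Which museum is closest to the venue of the 2019 conference?",
    required_tools=frozenset({"web_search", "map_lookup"}),
    gold_answer="City Museum",
)
search_only = Trajectory(tool_calls=("web_search", "web_search"), final_answer="Art Hall")
both_tools = Trajectory(tool_calls=("web_search", "map_lookup"), final_answer="city museum")

print(grade(task, search_only))
print(grade(task, both_tools))
```

Separating "never called a required tool" from other errors is one simple way to tell selection problems apart from usage problems; a real evaluation would also need to inspect tool arguments and intermediate results.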