Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Towards Embodied Agentic AI: Review and Classification of LLM- and VLM-Driven Robot Autonomy and Interaction

Created by
  • Haebom

Authors

Sahar Salimpour, Lei Fu, Farhad Keramat, Leonardo Militano, Giovanni Toffetti, Harry Edelman, Jorge Peña Queralta

Outline

This paper surveys recent research in which foundation models, including large language models (LLMs) and vision-language models (VLMs), have enabled novel approaches to robot autonomy and human-robot interfaces. It notes that vision-language-action models (VLAs) and large behavior models (LBMs) are improving the dexterity and capabilities of robotic systems, and it focuses on work that moves toward agentic applications and architectures. This work ranges from early explorations of GPT-style tool interfaces to more complex systems in which AI agents act as coordinators, planners, perception actors, or general-purpose interfaces. Such agent architectures let robots interpret natural language commands, invoke APIs, plan task sequences, and assist with operations and diagnostics. Because the field is evolving rapidly, the survey covers not only peer-reviewed research but also community-driven projects, ROS packages, and industry frameworks. The authors propose a taxonomy for classifying model integration approaches and provide a comparative analysis of the roles that agents play in the solutions found in the current literature.
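The "interpret a command, plan a task sequence, invoke APIs" loop described above can be sketched in a few lines of Python. This is only an illustrative sketch, not an architecture taken from the paper: `ToolRegistry`, `stub_planner`, `move_to`, `pick`, `build_registry`, and `run_agent` are all hypothetical names, and the rule-based `stub_planner` merely stands in for the LLM call that a real agent would make to turn a command into tool invocations.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A planned step: (tool name, keyword arguments for that tool).
Step = Tuple[str, Dict[str, str]]


@dataclass
class ToolRegistry:
    """Maps tool names to callables, mimicking a tool/function-calling interface."""
    tools: Dict[str, Callable[..., str]] = field(default_factory=dict)

    def register(self, name: str, fn: Callable[..., str]) -> None:
        self.tools[name] = fn

    def call(self, name: str, **kwargs: str) -> str:
        if name not in self.tools:
            raise KeyError(f"unknown tool: {name}")
        return self.tools[name](**kwargs)


# Hypothetical robot APIs; a real system might wrap ROS actions or services here.
def move_to(location: str) -> str:
    return f"moved to {location}"


def pick(obj: str) -> str:
    return f"picked {obj}"


def stub_planner(command: str) -> List[Step]:
    """Stand-in for an LLM planner (assumption): handles one fixed command pattern."""
    match = re.fullmatch(r"fetch the (\w+) from the (\w+)", command.lower().strip())
    if match is None:
        return []  # the planner has no plan for this command
    obj, location = match.groups()
    return [("move_to", {"location": location}), ("pick", {"obj": obj})]


def build_registry() -> ToolRegistry:
    registry = ToolRegistry()
    registry.register("move_to", move_to)
    registry.register("pick", pick)
    return registry


def run_agent(command: str, registry: ToolRegistry) -> List[str]:
    """Plan a task sequence for the command, then execute each step via the registry."""
    return [registry.call(name, **kwargs) for name, kwargs in stub_planner(command)]


if __name__ == "__main__":
    for result in run_agent("Fetch the cup from the kitchen", build_registry()):
        print(result)
```

Keeping the planner separate from the tool registry means the rule-based stub could be swapped for a model-backed planner without touching the robot-facing functions, which is one plausible reading of the "agent as planner or coordinator" role the summary mentions.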

Takeaways, Limitations

Takeaways:
Presents a comprehensive overview of research trends in robot autonomy and human-robot interfaces built on foundation models.
Systematically analyzes the approaches used in agent-based robotics architectures and the roles agents play in them.
Reflects the latest developments in the field, including community-driven projects and industry frameworks.
Contributes to the field by proposing a taxonomy for classifying model integration approaches.
Limitations:
The survey reflects research trends as of the paper's publication date (August 2025) and may not capture later technological advances.
Although it compares and analyzes a range of models and frameworks, quantitative performance comparisons may be limited.
It lacks an in-depth discussion of the safety and reliability of agent-based architectures.