Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

From LLMs to Actions: Latent Codes as Bridges in Hierarchical Robot Control

Created by
  • Haebom

Authors

Yide Shentu, Philipp Wu, Aravind Rajeswaran, Pieter Abbeel

Outline

This paper proposes Learnable Latent Codes as Bridges (LCB), an alternative interface layer between high-level task planners and low-level policies in hierarchical robot control. Hierarchical control needs a well-defined interface for communication between these two levels, and existing LLM-based approaches use natural language for that purpose. This choice has two limitations: some tasks are hard to express in natural language (e.g., dance moves), and transfer learning is difficult because of domain shift and catastrophic forgetting. LCB instead uses a learnable latent code as the bridge between the LLM and the low-level policy, so the LLM can convey goals without being restricted to language, and transfer learning does not destroy the embedding space of the pretrained word tokens. On the Language Table and Calvin benchmarks, the authors show experimentally that LCB outperforms baselines (including GPT-4V) that use pure language as the interface layer on tasks requiring reasoning and multi-step actions.
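The bridging idea described above can be illustrated with a toy sketch. This is not the authors' implementation: the `<ACT>` token name, the mean-pooled stand-in for the LLM, the squared-error objective, the additive `low_level_policy`, and all dimensions are assumptions chosen only to keep the example small. It shows a latent code passed to a low-level policy in place of text, and a training step that updates the new latent-side parameter while the pretrained word embeddings stay untouched.

```python
import random

ACT_TOKEN = "<ACT>"  # hypothetical name for the extra, learnable token


class LatentBridge:
    """Toy stand-in for an LLM whose <ACT> position emits a latent goal code."""

    def __init__(self, vocab, embed_dim=4, latent_dim=3, seed=0):
        rng = random.Random(seed)
        # Pretrained word-token embeddings: stored as tuples and never updated.
        self.word_embeddings = {
            tok: tuple(rng.uniform(-0.5, 0.5) for _ in range(embed_dim))
            for tok in vocab
        }
        # Learnable pieces: the <ACT> embedding and a projection to latent space.
        self.act_embedding = [rng.uniform(-0.5, 0.5) for _ in range(embed_dim)]
        self.projection = [
            [rng.uniform(-0.5, 0.5) for _ in range(embed_dim)]
            for _ in range(latent_dim)
        ]

    def hidden_state(self, tokens):
        # Assumed stand-in for the LLM forward pass:
        # mean word embedding plus the <ACT> embedding.
        dim = len(self.act_embedding)
        mean = [
            sum(self.word_embeddings[t][i] for t in tokens) / len(tokens)
            for i in range(dim)
        ]
        return [m + a for m, a in zip(mean, self.act_embedding)]

    def latent_code(self, tokens):
        # Project the hidden state to the latent code handed to the policy.
        h = self.hidden_state(tokens)
        return [sum(w * x for w, x in zip(row, h)) for row in self.projection]

    def train_step(self, tokens, target_latent, lr=0.1):
        # One gradient step on a squared error, updating only the <ACT> embedding.
        z = self.latent_code(tokens)
        residual = [zi - ti for zi, ti in zip(z, target_latent)]
        for j in range(len(self.act_embedding)):
            grad = 2.0 * sum(
                self.projection[i][j] * residual[i] for i in range(len(residual))
            )
            self.act_embedding[j] -= lr * grad
        return sum(r * r for r in residual)  # loss measured before the update


def low_level_policy(observation, latent):
    # Toy policy conditioned on the latent code (element-wise offset).
    return [o + z for o, z in zip(observation, latent)]


instruction = ["push", "the", "red", "block"]
bridge = LatentBridge(vocab=instruction)
frozen_before = dict(bridge.word_embeddings)

target = [0.2, -0.1, 0.4]  # made-up latent goal for illustration
loss_first = bridge.train_step(instruction, target)
loss_second = bridge.train_step(instruction, target)

action = low_level_policy([0.0, 0.0, 0.0], bridge.latent_code(instruction))
print(loss_first, loss_second, action)
```

In this sketch the policy never sees text, only the latent code, and the word-embedding table is identical before and after training. Those are the two properties the summary attributes to LCB; how the paper itself trains the bridge is not specified in this summary.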

Takeaways, Limitations

Takeaways:
Proposes a novel approach that overcomes the limits of natural language and eases transfer learning when an LLM is used in the interface layer for robot control.
Uses learnable latent codes for efficient communication between the LLM and low-level policies, improving performance on complex multi-step tasks.
Outperforms existing LLM-based methods on tasks requiring reasoning and multi-step actions.
Limitations:
Further studies are needed to investigate the generalization performance of the proposed LCB method.
Further verification of its applicability to various robotic platforms and tasks is required.
Optimization studies on the dimensionality and learning method of latent codes may be required.