Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

RT-Cache: Training-Free Retrieval for Real-Time Manipulation

Created by
  • Haebom

Authors

Owen Kwon, Abraham George, Alison Bartsch, Amir Barati Farimani

Overview

This paper proposes RT-Cache, a training-free, retrieval-based control pipeline for real-world robots that must repeat the same behavior in new environments with very little new data. Existing controllers either incur high per-step inference costs or require fine-tuning at deployment time. RT-Cache caches diverse image-action trajectories in a unified vector memory; at test time it embeds the current frame, retrieves a matching multi-step action snippet, and replays it, which replaces per-step model calls. A hierarchical search keeps lookups sub-second even with millions of entries, shifting cost from compute to storage and enabling real-time control on modest GPUs. On real-robot tasks and large-scale open logs, RT-Cache achieves higher success rates and shorter completion times than strong retrieval baselines (approximately 2x higher success and ~30% faster in the authors' setup). A single-episode anchoring study shows immediate adaptation to a more complex, contact-rich task without fine-tuning. By turning experience into an append-only memory, RT-Cache offers a simple, scalable path to few-shot deployment and a foundation for multimodal keys and optional integration with higher-level policies.
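The retrieve-and-replay idea summarized above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class name SnippetCache, the control_loop helper, the embed / read_frame / send_action callables, the cosine-similarity scoring, and the two-level index built from evenly spaced keys are all hypothetical choices made for the example. A real system would likely use a learned image encoder and a proper vector index instead.

```python
import numpy as np


class SnippetCache:
    """Toy append-only memory: frame embedding -> next few logged actions.

    Hypothetical sketch; names and index design are assumptions.
    """

    def __init__(self, horizon=2, n_cells=2):
        self.horizon = horizon    # actions replayed per lookup
        self.n_cells = n_cells    # size of the coarse index
        self.keys = []            # unit-norm embedding of each cached frame
        self.snippets = []        # the `horizon` actions logged from that frame
        self.centroids = None     # coarse index, rebuilt lazily after appends
        self.cells = None         # coarse cell id of every cached key

    def add_episode(self, embeddings, actions):
        """Append one logged episode of T embeddings and T actions."""
        embeddings = np.asarray(embeddings, dtype=float)
        actions = np.asarray(actions, dtype=float)
        for t in range(len(embeddings) - self.horizon + 1):
            self.keys.append(embeddings[t] / np.linalg.norm(embeddings[t]))
            self.snippets.append(actions[t:t + self.horizon])
        self.centroids = None     # mark the coarse index as stale

    def _build_index(self):
        """Coarse level: evenly spaced keys act as centroids.

        This is a crude stand-in for k-means or a vector database.
        """
        keys = np.stack(self.keys)
        n = min(self.n_cells, len(keys))
        picks = np.linspace(0, len(keys) - 1, n).astype(int)
        self.centroids = keys[picks]
        self.cells = np.argmax(keys @ self.centroids.T, axis=1)

    def retrieve(self, frame_embedding):
        """Coarse-to-fine lookup by cosine similarity; returns one snippet."""
        if self.centroids is None:
            self._build_index()
        query = np.asarray(frame_embedding, dtype=float)
        query = query / np.linalg.norm(query)
        cell = int(np.argmax(self.centroids @ query))   # nearest coarse cell
        members = np.flatnonzero(self.cells == cell)
        if members.size == 0:     # empty cell: fall back to a full scan
            members = np.arange(len(self.keys))
        scores = np.stack([self.keys[i] for i in members]) @ query
        return self.snippets[int(members[int(np.argmax(scores))])]


def control_loop(cache, embed, read_frame, send_action, n_lookups):
    """Embed the current frame once per lookup, then replay the whole snippet."""
    for _ in range(n_lookups):
        for action in cache.retrieve(embed(read_frame())):
            send_action(action)


# Usage with made-up 2-D embeddings and 1-D actions.
cache = SnippetCache(horizon=2, n_cells=2)
cache.add_episode([[1.0, 0.0], [0.9, 0.1], [0.8, 0.2]], [[0.0], [1.0], [2.0]])
cache.add_episode([[0.0, 1.0], [0.1, 0.9], [0.2, 0.8]], [[10.0], [11.0], [12.0]])
print(cache.retrieve([0.0, 1.0]))
```

In this sketch one embedding call yields `horizon` actions, so the per-step model cost is paid once per snippet rather than once per action, and adding experience is just another add_episode call. The coarse-to-fine lookup is approximate: a query near a cell boundary may miss its true nearest neighbor, which is the usual trade-off of such indexes.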

Takeaways, Limitations

Takeaways:
Presents a real-time robot control system that adapts to new environments without training.
Achieves higher success rates and shorter completion times than existing retrieval baselines.
Demonstrates immediate adaptation to a more complex task from a single episode, without fine-tuning.
Enables simple, scalable few-shot deployment.
Leaves room for integration with multimodal keys and higher-level policies.
Limitations:
Further research is needed on memory size and search speed.
Generalization performance evaluation is needed for various environments and tasks.
Long-term performance and stability evaluations are needed for complex tasks.
No concrete method is proposed for integration with higher-level policies.