Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Learning from 10 Demos: Generalisable and Sample-Efficient Policy Learning with Oriented Affordance Frames

Created by
  • Haebom

Author

Krishan Rana, Jad Abou-Chakra, Sourav Garg, Robert Lee, Ian Reid, Niko Suenderhauf

Outline

This paper notes that while imitation learning can produce skilled robot behavior, its low sample efficiency and limited generalization make long-horizon, multi-object tasks difficult: existing methods need many demonstrations to cover possible task variations, which is costly and impractical in real-world settings. The study introduces oriented affordance frames, a structured representation of the state and action spaces, which improves spatial and categorical generalization and allows policies to be trained efficiently from just 10 demonstrations. More importantly, this abstraction enables compositional generalization: independently trained sub-policies can be chained to solve long-horizon, multi-object tasks. To make transitions between sub-policies smooth, the authors introduce self-progress prediction, derived directly from the duration of the training demonstrations. Experiments on three real-world tasks involving multi-step, multi-object interaction show that, despite the small amount of data, the policies generalize robustly to unseen object appearances, geometries, and spatial arrangements, and achieve high success rates without relying on extensive training data.
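The paper's exact formulation is not reproduced here, but the two core ideas in the summary can be sketched. The function names, pose conventions, and details below are illustrative assumptions, not the authors' implementation: an oriented affordance frame expresses poses relative to an object-attached frame rather than the world, and self-progress labels come purely from each demonstration's duration.

```python
import numpy as np

def to_affordance_frame(T_world_ee, T_world_affordance):
    """Express the end-effector pose in an object-attached affordance frame.

    Both arguments are 4x4 homogeneous transforms in the world frame
    (hypothetical naming). A policy trained on such relative poses is
    invariant to where the object sits in the workspace, which is one
    plausible route to the spatial generalization the paper reports.
    """
    return np.linalg.inv(T_world_affordance) @ T_world_ee

def self_progress_targets(num_steps):
    """Per-timestep progress labels in [0, 1], derived only from the
    demonstration's length: step t of T maps to t / (T - 1).

    A head regressing this signal lets a sub-policy report how far
    through its sub-task it is; a scheduler can hand control to the
    next sub-policy as the prediction approaches 1.
    """
    return np.linspace(0.0, 1.0, num_steps)
```

A usage sketch: a grasp sub-policy consumes `to_affordance_frame(ee_pose, handle_frame)` as its observation, and control switches to the next sub-policy once its predicted progress crosses a threshold near 1.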

Takeaways, Limitations

Takeaways:
Oriented affordance frames enable sample-efficient policy learning from only 10 demonstrations.
Improved spatial and categorical generalization.
Long-horizon, multi-object tasks are solved through compositional generalization of independently trained sub-policies.
Self-progress prediction enables smooth transitions between sub-policies.
High success rates and robust generalization are verified on real-world tasks.
Limitations:
Experimental results are reported for only three real-world tasks.
Further research is needed to determine how well generalization holds across other environments and tasks.
The accuracy and reliability of self-progress prediction need further analysis.
The computational cost and complexity of the proposed method are not analyzed.