Daily Arxiv

This page organizes papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
The copyright of each paper belongs to its authors and their institutions; please cite the source when sharing.

TReF-6: Inferring Task-Relevant Frames from a Single Demonstration for One-Shot Skill Generalization

Created by
  • Haebom

Authors

Yuxuan Ding, Shuangge Wang, Tesca Fitzgerald


Outline

This study addresses the difficulty robots face in generalizing from a single demonstration, attributing it to the lack of a transferable, interpretable spatial representation. TReF-6 is a method for inferring a simplified 6DoF task-relevant frame from a single trajectory. It identifies an influence point from the trajectory geometry, uses it to define the origin of a local frame, and parameterizes Dynamic Movement Primitives (DMPs) in that frame. The inferred frame is semantically grounded through a vision-language model and localized in novel scenes by Grounded-SAM, enabling functionally consistent skill generalization. The authors validate TReF-6 in simulation, demonstrating robustness to trajectory noise, and deploy an end-to-end pipeline on real-world manipulation tasks, showing that it supports one-shot imitation learning that preserves task intent across diverse object configurations.
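To make the DMP-in-a-local-frame idea concrete, here is a minimal sketch (our own illustrative code, not the paper's implementation; all names and constants are assumptions). A one-dimensional DMP integrates a spring-damper system toward a goal, modulated by a learned forcing term along a decaying canonical phase. TReF-6's generalization step amounts to expressing the goal in the inferred task-relevant frame, so relocating that frame relocates the reproduced motion.

```python
def dmp_rollout(start, goal, n_steps=500, dt=0.01,
                alpha=25.0, beta=6.25, alpha_s=4.0, forcing=None):
    """Minimal discrete DMP rollout (illustrative sketch).

    With forcing=None this reduces to a critically damped spring
    pulling x toward the goal; a real DMP would fit `forcing`
    (a function of the canonical phase s) to the demonstration.
    """
    x, v, s = float(start), 0.0, 1.0
    traj = [x]
    for _ in range(n_steps):
        # Forcing term is scaled by phase and amplitude, as in standard DMPs.
        f = forcing(s) * s * (goal - start) if forcing else 0.0
        dv = alpha * (beta * (goal - x) - v) + f  # spring-damper dynamics
        v += dv * dt
        x += v * dt
        s += -alpha_s * s * dt  # canonical phase decays from 1 toward 0
        traj.append(x)
    return traj
```

In a frame-based setup, `goal` would be a point expressed in the inferred local frame; mapping that frame into a new scene (here, via Grounded-SAM localization) yields a new world-space goal while the shape of the motion is preserved by the forcing term.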

Takeaways, Limitations

Takeaways:
  • Infers a 6DoF task-relevant frame from a single demonstration, improving robot generalization.
  • Parameterizes DMPs with the inferred frame to capture the spatial structure of a task, going beyond start-goal imitation.
  • Uses a vision-language model and Grounded-SAM to localize the frame in new environments, ensuring adaptability.
  • Demonstrates practicality through validation in both simulation and real-world experiments.
Limitations:
  • No specific limitations are explicitly discussed in the abstract.