Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions. When sharing, please cite the source.

CUPID: Curating Data your Robot Loves with Influence Functions

Created by
  • Haebom

Authors

Christopher Agia, Rohan Sinha, Jingyun Yang, Rika Antonova, Marco Pavone, Haruki Nishimura, Masha Itkina, Jeannette Bohg

Outline

This paper observes that policy performance in robot imitation learning depends heavily on the quality and composition of the demonstration data, yet it is difficult to understand precisely how individual demonstrations contribute to downstream outcomes such as closed-loop task success or failure. To address this, the authors propose CUPID, a robot data curation method based on a novel influence-function-theoretic formulation for imitation learning policies. Given a set of evaluation rollouts, CUPID estimates the influence of each training demonstration on the policy's expected return, which allows demonstrations to be ranked and selected by their impact on closed-loop performance. CUPID curates data in two ways: filtering out training demonstrations that harm policy performance, and subselecting newly collected trajectories that are most likely to improve the policy. Simulation and hardware experiments show that the method consistently identifies the data that drives test-time performance. For example, training on less than 33% of the curated data can yield state-of-the-art diffusion policies on the simulated RoboMimic benchmark, and similar gains are observed on hardware. The hardware experiments further show that CUPID can identify strategies that are robust to distribution shift, isolate spurious correlations, and even improve the post-training of generalist robot policies. Code and videos are available at https://cupid-curation.github.io.
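The rank-and-filter idea described above can be illustrated with a small sketch. It is not the paper's estimator: it assumes a precomputed matrix `influence[i, j]` (a hypothetical quantity scoring how strongly training demonstration i pushed the policy toward the behavior seen in evaluation rollout j) and simply weights each column by the rollout's return. The function names `demo_scores` and `keep_top_fraction`, and the use of +1/-1 returns for success and failure, are assumptions made for illustration.

```python
import numpy as np


def demo_scores(influence: np.ndarray, returns: np.ndarray) -> np.ndarray:
    """Return-weighted average influence of each training demonstration.

    influence: shape (n_demos, n_rollouts), hypothetical per-rollout influence.
    returns:   shape (n_rollouts,), e.g. +1 for success and -1 for failure.
    A positive score suggests the demonstration helps closed-loop performance;
    a negative score suggests it hurts.
    """
    return influence @ returns / len(returns)


def keep_top_fraction(scores: np.ndarray, fraction: float) -> np.ndarray:
    """Indices of the highest-scoring demonstrations, in ascending index order."""
    k = max(1, int(round(fraction * len(scores))))
    order = np.argsort(-scores, kind="stable")  # highest score first
    return np.sort(order[:k])


# Toy example: 3 demonstrations, 2 evaluation rollouts (one success, one failure).
influence = np.array([
    [0.5, -0.1],   # supports the successful rollout, opposes the failed one
    [-0.4, 0.3],   # opposes the success, supports the failure
    [0.2, 0.2],    # supports both equally
])
returns = np.array([1.0, -1.0])

scores = demo_scores(influence, returns)
kept = keep_top_fraction(scores, fraction=2 / 3)
```

In this toy case the second demonstration receives a negative score and is the one filtered out. The same ranking could be applied to newly collected trajectories to pick those most likely to help, which mirrors the selection use case described above.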

Takeaways, Limitations

Takeaways:
Presents a novel approach to improving policy performance in imitation learning through data curation.
Demonstrates that state-of-the-art performance is achievable with a small fraction of the data.
Shows that the method can identify strategies robust to distribution shift and isolate spurious correlations.
Suggests that the post-training of generalist robot policies can be improved.
Limitations:
The effectiveness of the proposed method may vary depending on the dataset and task used.
The computational cost of computing influence functions can be high.
Further research is needed on generalization performance in real robotic systems.