Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

X-Sim: Cross-Embodiment Learning via Real-to-Sim-to-Real

Created by
  • Haebom

Authors

Prithwish Dan, Kushal Kedia, Angela Chao, Edward Weiyi Duan, Maximus Adrian Pace, Wei-Chiu Ma, Sanjiban Choudhury

Outline

This paper proposes X-Sim, a real-to-sim-to-real framework. Instead of mimicking human motion, X-Sim extracts object motion from RGBD images to define object-centric rewards, which are then used to train a reinforcement learning (RL) policy in simulation. The learned policy is distilled into an image-conditioned diffusion policy using synthetic rollouts rendered under varied viewpoints and lighting. For transfer to the real environment, X-Sim aligns real and simulated observations via online domain adaptation. Across five manipulation tasks, it achieves an average 30% performance improvement without requiring any robot teleoperation data, matches existing methods with 10x less data collection time, and generalizes well to new camera viewpoints and test-time variations.
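The core idea of an object-centric reward, as described above, can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the function name, the use of Euclidean distance on object positions, and the scale parameter `sigma` are all illustrative assumptions.

```python
import numpy as np

def object_centric_reward(sim_obj_pos, target_obj_pos, sigma=0.05):
    """Hypothetical dense reward for RL training in simulation.

    Rewards the policy for moving the simulated object toward the
    object pose tracked from real RGBD observations, rather than
    for matching human motion. The exponential shaping and the
    scale `sigma` (in meters) are illustrative choices.
    """
    dist = np.linalg.norm(np.asarray(sim_obj_pos) - np.asarray(target_obj_pos))
    return float(np.exp(-dist / sigma))
```

Because the reward depends only on object state, any robot embodiment that achieves the same object motion receives the same reward, which is what makes the cross-embodiment transfer possible.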

Takeaways, Limitations

Takeaways:
Robot manipulation policies can be learned without imitating human motion.
Object-centric rewards improve real-to-sim-to-real transfer performance.
Online domain adaptation increases real-world applicability.
Data collection time is reduced while generalization performance improves.
Limitations:
Depends on RGBD image data.
Does not fully close the gap between simulated and real environments.
Generalization beyond the five presented manipulation tasks requires further validation.