Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

RoboTwin 2.0: A Scalable Data Generator and Benchmark with Strong Domain Randomization for Robust Bimanual Robotic Manipulation

Created by
  • Haebom

Author

Tianxing Chen, Zanxin Chen, Baijun Chen, Zijian Cai, Yibin Liu, Zixuan Li, Qiwei Liang, Xianliang Lin, Yiheng Ge, Zhenyu Gu, Weiliang Deng, Yubin Guo, Tian Nian, Xuanbing Zhixuan Liang, Yusen Qin, Xiaokang Yang, Ping Luo, Yao Mu

Outline

RoboTwin 2.0 is a scalable framework for generating large-scale, diverse, and realistic data for dual-arm manipulation. Existing datasets have two limitations: they lack scalable task generation methods, and they rely on oversimplified simulation environments. To overcome these, we designed an expert data synthesis pipeline that combines a multimodal large language model (MLLM) with simulation-based refinement, built on the RoboTwin-OD object library, which contains 731 object instances across 147 categories. We applied structured domain randomization along five axes (clutter, lighting, background, table height, and language) to improve simulation-to-reality transfer, data diversity, and policy robustness. Applying this framework to 50 dual-arm tasks and five robot models, we achieved a 10.9% improvement in code generation success rate, a 367% relative performance improvement for a VLA model trained on synthetic data plus 10 real-world demos, and a 228% performance improvement for a zero-shot model trained solely on synthetic data. To support scalable, robust dual-arm manipulation research, we release the data generators, benchmarks, datasets, and code.
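The five randomization axes listed above can be illustrated with a short sketch. This is not the RoboTwin 2.0 implementation: the `SceneConfig` class, the `sample_scene` function, the background names, the instruction templates, and all numeric ranges are hypothetical values chosen only to show how one scene configuration might be sampled per episode.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneConfig:
    """One randomized scene; every field maps to one of the five axes."""
    num_distractors: int     # clutter: extra objects placed on the table
    light_intensity: float   # lighting: arbitrary brightness scale
    background: str          # background: texture name
    table_height_m: float    # table height in meters
    instruction: str         # language: phrasing of the task instruction


# Hypothetical option pools; the real framework's assets are not shown here.
BACKGROUNDS = ["wood", "marble", "fabric"]
INSTRUCTION_TEMPLATES = [
    "Pick up the {obj} with both arms.",
    "Use both grippers to lift the {obj}.",
    "Grasp the {obj} using two hands.",
]


def sample_scene(rng: random.Random, obj: str) -> SceneConfig:
    """Draw each axis independently from its (assumed) range."""
    return SceneConfig(
        num_distractors=rng.randint(0, 5),
        light_intensity=rng.uniform(0.3, 1.5),
        background=rng.choice(BACKGROUNDS),
        table_height_m=rng.uniform(0.70, 0.80),
        instruction=rng.choice(INSTRUCTION_TEMPLATES).format(obj=obj),
    )


# A seeded generator makes the sampled scenes reproducible.
rng = random.Random(0)
scenes = [sample_scene(rng, "bottle") for _ in range(3)]
for scene in scenes:
    print(scene)
```

Sampling each axis independently keeps the sketch simple; a real generator would also need to check that the sampled scene is physically valid (for example, that distractors do not overlap the target object).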

Takeaways, Limitations

Takeaways:
Provides a large-scale, diverse, and realistic synthetic data generation framework for scalable dual-arm manipulation.
Presents an efficient task generation pipeline that uses a multimodal language model and simulation-based refinement.
Improves simulation-to-real transfer and robustness to environmental changes through structured domain randomization.
Demonstrates effective policy learning and improved zero-shot performance using synthetic data.
Supports open, scalable research by releasing the data generators, benchmarks, datasets, and code.
Limitations:
The variety of robot models and tasks currently supported may be limited.
Simulation cannot perfectly match real environments, so additional tuning may be required for real-world deployment.
The quality of the generated data may depend on the performance of the MLLM.
The scope of structured domain randomization needs to be further expanded.
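
The 367% figure in the outline is stated as a relative improvement, which is easy to misread as an absolute one. A minimal sketch of the arithmetic follows; the function names and the baseline value are hypothetical and are not taken from the paper.

```python
def relative_improvement_pct(baseline: float, improved: float) -> float:
    """Relative improvement of `improved` over `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline * 100.0


def apply_relative_improvement(baseline: float, pct: float) -> float:
    """Value reached when `baseline` improves by `pct` percent (relative)."""
    return baseline * (1.0 + pct / 100.0)


# Hypothetical baseline success rate, for illustration only.
baseline_success = 10.0
improved_success = apply_relative_improvement(baseline_success, 367.0)
print(round(improved_success, 2))
print(round(relative_improvement_pct(baseline_success, improved_success), 2))
```

A 367% relative improvement therefore means the improved score is about 4.67 times the baseline, not 367 percentage points higher.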