Daily Arxiv

This page curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Learning Pivoting Manipulation with Force and Vision Feedback Using Optimization-based Demonstrations

Created by
  • Haebom

Authors

Yuki Shirai, Kei Ota, Devesh K. Jha, Diego Romeres

Outline

This paper presents a framework that combines model-based and learning-based approaches to non-prehensile manipulation, pairing the efficiency of the former with the robustness of the latter. We design demonstration-guided deep reinforcement learning (RL) on top of computationally efficient contact-implicit trajectory optimization (CITO), which accounts for contact within the optimization itself and supplies the demonstrations that make learning sample-efficient. We also present a simulation-to-real transfer approach based on a privileged training strategy, so that the robot performs pivoting manipulation using only proprioception, vision, and force sensing, without access to privileged information (e.g., object mass, size, or pose). Evaluation on several pivoting tasks shows that the method transfers successfully from simulation to the real robot. Further details can be found in the video provided at the YouTube link.
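To make the idea of demonstration-guided RL concrete, here is a minimal sketch of one common way to use demonstrations: adding an imitation bonus to the task reward and annealing its weight over training. The summary does not say how the paper actually uses the CITO demonstrations, so the function names, the exponential bonus, and the linear annealing schedule are all assumptions made for illustration.

```python
import math


def demo_guided_reward(task_reward, state, demo_state, demo_weight, scale=1.0):
    """Task reward plus an imitation bonus (hypothetical shaping, not the paper's).

    The bonus lies in (0, 1] and equals 1 when the state matches the
    demonstration state exactly.
    """
    sq_dist = sum((s - d) ** 2 for s, d in zip(state, demo_state))
    imitation_bonus = math.exp(-scale * sq_dist)
    return task_reward + demo_weight * imitation_bonus


def annealed_weight(step, total_steps, initial_weight=1.0):
    """Linearly decay the demonstration weight to zero (assumed schedule)."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return initial_weight * (1.0 - frac)
```

Early in training the bonus pulls the policy toward the optimized trajectory; as the weight decays, the task reward takes over, which lets the policy deviate from the demonstration where the model behind it is inaccurate.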

Takeaways, Limitations

Takeaways:
By combining the advantages of model-based and learning-based approaches, we present an efficient and robust solution to the non-prehensile manipulation problem.
We achieve sample-efficient learning using CITO and demonstration-guided RL.
We present a method for successful simulation-to-real transfer that does not require privileged information at deployment.
We verify the performance through experiments on a real robot system.
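One common reading of a privileged training strategy is teacher-student distillation: a teacher policy sees privileged simulator state, and a student that sees only sensor readings is trained to reproduce the teacher's actions. Whether the paper uses exactly this scheme is not stated in this summary. The toy sketch below assumes it, with a hypothetical one-dimensional teacher, a hypothetical force-sensor model, and a single-weight linear student.

```python
def teacher_policy(mass):
    """Hypothetical teacher: uses the privileged object mass directly."""
    return 9.81 * mass


def force_reading(mass):
    """Hypothetical sensor model: the student sees a force signal, not the mass."""
    return 2.0 * mass


def student_policy(weight, observation):
    """Linear student acting on the sensor observation only."""
    return weight * observation


def distill_student(masses, lr=0.01, epochs=500):
    """Fit the student weight by gradient descent on the squared action error."""
    weight = 0.0
    for _ in range(epochs):
        grad = 0.0
        for mass in masses:
            obs = force_reading(mass)
            error = student_policy(weight, obs) - teacher_policy(mass)
            grad += 2.0 * error * obs
        weight -= lr * grad / len(masses)
    return weight
```

After distillation, the student can be run on an object whose mass it never observes, which is the sense in which privileged information is needed only during training in simulation.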
Limitations:
Further research is needed to determine the generalization performance of the proposed method.
Robustness across a wider variety of environments and objects needs further improvement.
The reliance on a privileged training strategy may limit the scalability of the system.