Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
The copyright of the paper belongs to the author and the relevant institution. When sharing, simply cite the source.

Optimizing Grasping in Legged Robots: A Deep Learning Approach to Loco-Manipulation

Created by
  • Haebom

Authors

Dilermando Almeida, Guilherme Lazzarini, Juliano Negri, Thiago H. Segreto, Ricardo V. Godoy, Marcelo Becker

Outline

We present a deep learning framework that improves the precision and adaptability of grasping for quadruped robots equipped with arms. A sim-to-real approach minimizes reliance on real-world data collection. We developed a pipeline that generates a synthetic dataset of grasp attempts on common objects within the Genesis simulation environment. Thousands of interactions were simulated from various viewpoints, producing pixel-wise annotated grasp-quality maps that serve as ground truth for the model. This dataset was used to train a custom CNN with a U-Net-like architecture that processes multimodal inputs from onboard RGB and depth cameras. The trained model outputs a grasp-quality heatmap used to identify the optimal grasp point. We validated the complete framework on a quadruped robot, which successfully performed a full loco-manipulation task: autonomously navigating to a target object, perceiving it with its sensors, predicting the optimal grasp pose with the model, and executing a precise grasp.
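The last step, turning a grasp-quality heatmap into a grasp point, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the best grasp is simply the highest-scoring pixel and that the pixel is back-projected to 3D with a standard pinhole camera model. The function names, the toy heatmap and depth values, and the intrinsics `fx`, `fy`, `cx`, `cy` are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

Grid = List[List[float]]


def best_grasp_pixel(heatmap: Grid) -> Tuple[int, int, float]:
    """Return (row, col, score) of the highest-scoring pixel in the heatmap."""
    best = (0, 0, heatmap[0][0])
    for r, row in enumerate(heatmap):
        for c, score in enumerate(row):
            if score > best[2]:
                best = (r, c, score)
    return best


def deproject(row: int, col: int, depth_m: float,
              fx: float, fy: float, cx: float, cy: float) -> Tuple[float, float, float]:
    """Back-project a pixel to a 3D point in the camera frame (pinhole model)."""
    x = (col - cx) * depth_m / fx
    y = (row - cy) * depth_m / fy
    return (x, y, depth_m)


# Toy 3x4 heatmap and aligned depth map (metres); values are made up.
heatmap = [
    [0.1, 0.2, 0.1, 0.0],
    [0.3, 0.9, 0.4, 0.1],
    [0.2, 0.5, 0.3, 0.0],
]
depth = [
    [1.0, 1.0, 1.0, 1.0],
    [1.0, 0.5, 0.5, 1.0],
    [1.0, 0.5, 0.5, 1.0],
]

row, col, score = best_grasp_pixel(heatmap)
# Hypothetical intrinsics; a real camera would supply calibrated values.
point = deproject(row, col, depth[row][col], fx=2.0, fy=2.0, cx=2.0, cy=1.0)
print(row, col, score, point)
```

A real system would also need a gripper orientation and a transform from the camera frame to the robot frame. This summary does not describe how the paper handles either, so both are left out here.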

Takeaways, Limitations

Takeaways:
Reduces the need for real-world data collection through a sim-to-real approach.
Generates and uses a synthetic grasping dataset within the Genesis simulation environment.
Designs a U-Net-based CNN that processes multimodal inputs: RGB images, depth maps, segmentation masks, and surface normal maps.
Demonstrates that a complete loco-manipulation task can be performed autonomously on a quadruped robot.
Presents a scalable and effective solution for object handling trained in simulation.
Limitations:
The paper does not explicitly discuss limitations. A sim-to-real gap between the simulated and real environments may remain, as is typical of sim-to-real approaches.