Daily Arxiv

This page curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

V-HOP: Visuo-Haptic 6D Object Pose Tracking

Created by
  • Haebom

Authors

Hongyu Li, Mingxi Jia, Tuluhan Akbulut, Yu Xiang, George Konidaris, Srinath Sridhar

Outline

This paper presents a method that improves the accuracy and robustness of object pose estimation by integrating visual and haptic information. Previous approaches generalize poorly across diverse grippers, sensor placements, and the gap between simulation and real environments, and they track inconsistently because they estimate the pose independently for each frame. To address these problems, the authors propose a unified haptic representation that handles multiple gripper embodiments, together with a visuo-haptic transformer-based object pose tracker that integrates visual and haptic inputs. The method generalizes well and remains robust across diverse embodiments, objects, and sensor types (both taxel-based and vision-based tactile sensors), and it significantly outperforms state-of-the-art visual trackers in real-world experiments. The authors also demonstrate that the real-time tracking results can be incorporated into motion planning to enable precise manipulation tasks.

Takeaways, Limitations

Takeaways:
Presents a novel visuo-haptic object pose tracking method that generalizes well across various grippers and sensors.
Achieves significantly higher performance than state-of-the-art visual trackers in real-world environments.
Demonstrates the feasibility of precise manipulation tasks based on real-time object tracking.
Handles various gripper embodiments through a unified haptic representation.
Limitations:
The performance of the proposed method may be biased toward certain datasets.
Further research is needed on robustness to various kinds of noise and obstacles in real environments.
Further validation of generalization across different object types and shapes is needed.