Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Boosting Omnidirectional Stereo Matching with a Pre-trained Depth Foundation Model

Created by
  • Haebom

Author

Jannik Endres, Oliver Hahn, Charles Corbière, Simone Schaub-Meyer, Stefan Roth, Alexandre Alahi

Outline

This paper proposes DFI-OmniStereo, a novel method for omnidirectional depth perception that aims to produce high-resolution depth maps through low-cost stereo depth estimation with omnidirectional cameras. To overcome the limitations of existing methods, it integrates a large-scale pre-trained depth foundation model for relative monocular depth estimation into an iterative optimization-based stereo matching architecture. Specifically, a two-stage training strategy adapts the foundation model's relative monocular depth features and then performs scale-invariant fine-tuning. On the real-world Helvipad dataset, the method achieves state-of-the-art results, reducing disparity MAE by approximately 16% compared to the best previously published omnidirectional stereo method.
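The scale-invariant fine-tuning mentioned above relies on the standard idea that a relative (affine-invariant) monocular depth prediction can be compared to metric ground truth only after solving for an unknown scale and shift. The sketch below illustrates that idea in a minimal, generic form; the function names and the least-squares alignment are illustrative assumptions, not the authors' actual implementation.

```python
import numpy as np

def align_scale_shift(pred, gt):
    """Least-squares scale s and shift t minimizing ||s * pred + t - gt||^2.

    pred: relative depth predictions (1-D array), known only up to an
          affine transform; gt: metric ground-truth depth (1-D array).
    """
    A = np.stack([pred, np.ones_like(pred)], axis=1)  # design matrix [pred, 1]
    (s, t), *_ = np.linalg.lstsq(A, gt, rcond=None)
    return s, t

def scale_invariant_mae(pred, gt):
    """Mean absolute error after optimal affine alignment of pred to gt."""
    s, t = align_scale_shift(pred, gt)
    return float(np.mean(np.abs(s * pred + t - gt)))
```

Because the alignment absorbs any global scale and shift, a prediction that is correct up to an affine transform incurs (near-)zero loss, which is what lets a relative-depth foundation model be supervised with metric data.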

Takeaways, Limitations

Takeaways:
Leveraging a large-scale pre-trained depth foundation model improves the accuracy of omnidirectional stereo matching.
A novel two-step training strategy effectively utilizes relative monocular depth information.
We achieved results that surpassed previous state-of-the-art performance on the Helvipad dataset.
This could bring significant advancements to mobile robotics, which requires omnidirectional depth perception.
Limitations:
The method has so far been evaluated mainly on a single dataset (Helvipad); further evaluation is needed to determine how well it generalizes to other datasets.
There is a need to further improve robustness across different environments, depth ranges, and lighting conditions.
There is a lack of analysis on computational costs and real-time processing potential.