Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Blending 3D Geometry and Machine Learning for Multi-View Stereopsis

Created by
  • Haebom

Authors

Vibhas Vats, Md. Alimoor Reza, David Crandall, Soon-heung Jung

Outline

Traditional multi-view stereo (MVS) methods rely primarily on photometric and geometric consistency constraints. Recent learning-based methods instead use the plane-sweep algorithm to infer 3D geometry and apply explicit geometric consistency (GC) checks only as a post-processing step, so the checks have no effect on training. In this study, we present GC MVSNet plus plus, a novel method that actively enforces geometric consistency of the reference-view depth map across multiple source views (multi-view) and multiple scales (multi-scale) during training. The integrated GC check directly penalizes geometrically inconsistent pixels, which speeds up learning and halves the number of training iterations compared to other MVS methods. We also present a densely connected cost regularization network with two block designs (simple and feature-dense) built to exploit dense feature connections for better regularization. Extensive experiments show that the proposed method achieves state-of-the-art performance on the DTU and BlendedMVS datasets and ranks second on the Tanks and Temples benchmark. GC MVSNet plus plus is the first method to enforce supervised geometric consistency across multiple views and multiple scales during training. The code is publicly available.
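To make the idea of a geometric consistency check concrete, here is a minimal sketch of a forward-backward reprojection test for a single pixel at a single scale. It is not the authors' implementation. The function names (`backproject`, `project`, `is_consistent`, `gc_penalty`), the thresholds, and the "1 + number of inconsistent views" weighting are all assumptions chosen for illustration; the paper's actual penalty and its multi-scale handling may differ.

```python
from math import hypot

# A camera is (intrinsics, R, t) with intrinsics = (fx, fy, cx, cy) and a
# world-to-camera pose p_cam = R @ X_world + t. This layout is an assumption.

def backproject(cam, u, v, depth):
    """Lift pixel (u, v) with the given depth to a 3D world point."""
    (fx, fy, cx, cy), R, t = cam
    pc = [(u - cx) * depth / fx, (v - cy) * depth / fy, depth]
    d = [pc[i] - t[i] for i in range(3)]
    # Invert the pose: X_world = R^T (p_cam - t).
    return [sum(R[j][i] * d[j] for j in range(3)) for i in range(3)]

def project(cam, X):
    """Project world point X into the camera; returns (u, v, depth)."""
    (fx, fy, cx, cy), R, t = cam
    pc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    return fx * pc[0] / pc[2] + cx, fy * pc[1] / pc[2] + cy, pc[2]

def is_consistent(u, v, d_ref, ref_cam, src_cam, src_depth,
                  pix_thresh=1.0, depth_thresh=0.01):
    """Forward-backward reprojection test for one reference pixel."""
    # Reference pixel -> world -> source image.
    us, vs, _ = project(src_cam, backproject(ref_cam, u, v, d_ref))
    # Look up the source depth there and reproject back into the reference.
    u2, v2, d2 = project(ref_cam, backproject(src_cam, us, vs, src_depth(us, vs)))
    pix_err = hypot(u2 - u, v2 - v)        # reprojection error in pixels
    depth_err = abs(d2 - d_ref) / d_ref    # relative depth error
    return pix_err < pix_thresh and depth_err < depth_thresh

def gc_penalty(u, v, d_ref, ref_cam, src_views):
    """Illustrative per-pixel loss weight: 1 + number of inconsistent source views."""
    return 1 + sum(not is_consistent(u, v, d_ref, ref_cam, cam, depth)
                   for cam, depth in src_views)

# Toy scene: a fronto-parallel plane at depth 2 seen by three parallel cameras.
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
K = (100.0, 100.0, 32.0, 32.0)
ref_cam = (K, I3, [0.0, 0.0, 0.0])
plane = lambda u, v: 2.0  # every source pixel sees depth 2
src_views = [((K, I3, [-0.1, 0.0, 0.0]), plane), ((K, I3, [0.1, 0.0, 0.0]), plane)]

good = gc_penalty(40, 30, 2.0, ref_cam, src_views)  # correct depth estimate
bad = gc_penalty(40, 30, 2.5, ref_cam, src_views)   # wrong depth estimate
```

In this toy scene, the correct depth estimate passes the check in both source views and keeps a weight of 1, while the wrong estimate fails in both and gets a weight of 3. Multiplying a per-pixel depth loss by such a weight is one simple way to penalize geometrically inconsistent pixels during training rather than filtering them afterwards.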

Takeaways, Limitations

Takeaways:
Enforcing multi-view, multi-scale geometric consistency during training improves MVS performance.
Building geometric consistency checks into training halves the number of training iterations.
The method achieves state-of-the-art performance on the DTU and BlendedMVS datasets and ranks second on the Tanks and Temples benchmark.
A densely connected cost regularization network improves cost regularization.
Releasing the source code improves reproducibility and extensibility.
Limitations:
The performance gains of the proposed method may be limited to specific datasets.
Generalization to more complex and diverse environments needs further study.
A more comprehensive comparison with other state-of-the-art MVS methods may be needed.