Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

One Step Closer: Creating the Future to Boost Monocular Semantic Scene Completion

Created by
  • Haebom

Authors

Haoang Lu, Yuanqi Su, Xiaoning Zhang, Hao Hu

Outline

This paper addresses visual 3D semantic scene completion (SSC), which infers the complete 3D layout and semantics of a scene from a single 2D image. Existing monocular SSC methods struggle in real-world traffic scenes, where a significant portion of the scene is occluded or lies outside the camera's field of view. To address this, the authors propose Creating the Future SSC (CF-SSC), a temporal SSC framework that extends the model's effective perceptual range by predicting pseudo-future frames. CF-SSC combines pose and depth to establish accurate 3D correspondences, then fuses past, present, and predicted future frames in 3D space in a geometrically consistent way. Unlike existing methods that rely on simple feature stacking, its 3D perception architecture explicitly models spatiotemporal relationships for more robust scene completion. Experiments on the SemanticKITTI and SSCBench-KITTI-360 benchmarks show state-of-the-art performance and validate the method's effectiveness at inferring occluded regions and improving 3D scene completion accuracy.
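As a rough illustration of how pose and depth together give a 3D correspondence between frames, the sketch below lifts a pixel to 3D with a pinhole camera model, moves it into another frame with a rigid transform, and reprojects it. This is standard multi-view geometry, not the authors' implementation; the intrinsics, the pose, the pixel, and all function names are hypothetical values chosen for the example.

```python
# Minimal sketch (assumed pinhole model; not the CF-SSC code).

def backproject(u, v, depth, fx, fy, cx, cy):
    """Lift pixel (u, v) with metric depth to a 3D point in the camera frame."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)


def transform(point, pose):
    """Apply a 4x4 rigid transform (row-major nested lists) to a 3D point."""
    x, y, z = point
    return tuple(
        pose[r][0] * x + pose[r][1] * y + pose[r][2] * z + pose[r][3]
        for r in range(3)
    )


def project(point, fx, fy, cx, cy):
    """Project a camera-frame 3D point to pixels; None if it is behind the camera."""
    x, y, z = point
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


# Hypothetical intrinsics (focal lengths and principal point, in pixels).
FX, FY = 700.0, 700.0
CX, CY = 600.0, 180.0

# Hypothetical relative pose mapping current-frame points into a future frame:
# the camera has moved 2 m forward along +z, so points end up 2 m closer.
CUR_TO_FUTURE = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, -2.0],
    [0.0, 0.0, 0.0, 1.0],
]

p_cur = backproject(740.0, 250.0, 10.0, FX, FY, CX, CY)  # point in current frame
p_fut = transform(p_cur, CUR_TO_FUTURE)                  # same point, future frame
uv_fut = project(p_fut, FX, FY, CX, CY)                  # where it appears there
```

Once points from several frames are expressed in one coordinate system this way, their features can be accumulated in a shared 3D grid instead of being stacked along the channel axis; how CF-SSC actually performs that fusion is not detailed in this summary.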

Takeaways, Limitations

Takeaways:
Pseudo-future frame prediction addresses a key limitation of monocular SSC: inferring regions that are occluded or outside the camera view.
The paper presents a new architecture that fuses past, present, and predicted future frames in 3D space in a geometrically consistent way.
The method achieves state-of-the-art performance on the SemanticKITTI and SSCBench-KITTI-360 benchmarks.
Explicitly modeling spatiotemporal relationships yields more robust scene completion than simple feature stacking.
Limitations:
The computational cost and real-time feasibility of the proposed method are not analyzed.
Robustness to varying weather and lighting conditions needs further evaluation.
Performance may depend heavily on the accuracy of the predicted future frames, so a strategy for handling prediction errors is needed.