Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

OE3DIS: Open-Ended 3D Point Cloud Instance Segmentation

Created by
  • Haebom

Authors

Phuc DA Nguyen, Minh Luu, Anh Tran, Cuong Pham, Khoi Nguyen

Outline

This paper introduces the Open-Ended 3D Instance Segmentation (OE-3DIS) problem, in which novel objects are segmented without predefined class names. Existing open-vocabulary 3D instance segmentation (OV-3DIS) methods are limited by their reliance on predefined class names at test time; OE-3DIS removes this requirement. The authors build strong baselines by combining OV-3DIS approaches with 2D multimodal large language models (MLLMs), and evaluate them using a novel Open-Ended Score alongside the standard AP score; together these assess the semantic and geometric quality of the predicted masks and their associated class names. On the ScanNet200 and ScanNet++ datasets, the proposed approach significantly improves over the baselines and even outperforms Open3DIS, the previous state-of-the-art OV-3DIS method.
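To make the geometric side of an AP-style evaluation concrete, here is a minimal sketch of how predicted point-cloud instance masks are typically matched to ground-truth masks by intersection-over-union (IoU). It is not the paper's evaluation code: the function names `mask_iou` and `match_at_threshold`, the representation of masks as lists of point indices, and the 0.5 threshold are all assumptions made for illustration. The sketch is class-agnostic, so it does not cover the semantic part (scoring predicted class names) that the Open-Ended Score addresses.

```python
def mask_iou(pred_points, gt_points):
    """IoU between two instance masks given as collections of point indices."""
    pred, gt = set(pred_points), set(gt_points)
    union = pred | gt
    if not union:
        return 0.0
    return len(pred & gt) / len(union)


def match_at_threshold(pred_masks, gt_masks, iou_threshold=0.5):
    """Greedily match predictions to ground-truth masks.

    pred_masks is assumed to be sorted by descending confidence.
    Returns (true_positives, false_positives, false_negatives).
    """
    unmatched_gt = set(range(len(gt_masks)))
    tp = 0
    for pred in pred_masks:
        best_gt, best_iou = None, 0.0
        # Find the still-unmatched ground-truth mask with the highest overlap.
        for g in unmatched_gt:
            iou = mask_iou(pred, gt_masks[g])
            if iou > best_iou:
                best_gt, best_iou = g, iou
        # Count a true positive only if the overlap clears the threshold.
        if best_gt is not None and best_iou >= iou_threshold:
            unmatched_gt.remove(best_gt)
            tp += 1
    fp = len(pred_masks) - tp
    fn = len(unmatched_gt)
    return tp, fp, fn


if __name__ == "__main__":
    # Hypothetical toy scene: two predictions, two ground-truth instances.
    preds = [[0, 1, 2, 3], [10, 11]]
    gts = [[0, 1, 2], [20, 21]]
    print(match_at_threshold(preds, gts))
```

An AP score is then obtained by sweeping over confidence-ranked predictions and averaging precision across recall levels, usually at several IoU thresholds; that aggregation step is omitted here.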

Takeaways, Limitations

Takeaways:
Defines the OE-3DIS problem, which enables 3D instance segmentation without predefined class names, and provides strong baselines and evaluation metrics for it, contributing to more autonomous 3D object recognition systems.
Improves 3D instance segmentation performance by leveraging 2D multimodal large language models.
Introduces the Open-Ended Score, which enables a comprehensive assessment of the semantic and geometric quality of predictions.
Outperforms Open3DIS, the previous state-of-the-art OV-3DIS method.
Limitations:
Further experiments are needed to evaluate the generalization performance of the proposed method.
Additional performance evaluations on various 3D datasets are needed.
Because the method relies heavily on 2D multimodal large language models, the limitations of those models may affect the performance of OE-3DIS.