Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

GLSim: Detecting Object Hallucinations in LVLMs via Global-Local Similarity

Created by
  • Haebom

Author

Seongheon Park, Yixuan Li

Outline

This paper proposes GLSim, a novel training-free framework for reliable object hallucination detection in large vision-language models (LVLMs). Unlike existing methods that rely on only a global or only a local perspective, GLSim combines complementary global and local embedding similarity signals between the image and text modalities. Experimental results show that GLSim outperforms existing methods at detecting object hallucinations.
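To make the idea concrete, here is a minimal sketch of combining a global and a local similarity signal for a candidate object mention. This is an illustrative toy, not the paper's implementation: the function name `gl_similarity_score`, the weighting parameter `alpha`, and the use of random vectors in place of real image/text embeddings are all assumptions for demonstration.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def gl_similarity_score(global_img_emb, patch_embs, obj_text_emb, alpha=0.5):
    """Blend a global image-text similarity with the best local
    (patch-level) similarity for one object mentioned in the text.
    A low combined score suggests the object may be hallucinated.
    `alpha` (hypothetical parameter) balances the two signals."""
    g = cosine(global_img_emb, obj_text_emb)            # global signal
    l = max(cosine(p, obj_text_emb) for p in patch_embs)  # local signal
    return alpha * g + (1 - alpha) * l

# Toy demo: random unit-scale vectors stand in for real embeddings
# that would normally come from an LVLM's vision and text encoders.
rng = np.random.default_rng(0)
img = rng.normal(size=64)
patches = [rng.normal(size=64) for _ in range(16)]
obj = rng.normal(size=64)
score = gl_similarity_score(img, patches, obj)
```

In practice, a score below some threshold would flag the object as hallucinated; the key point the sketch shows is that the global and local views are computed separately and then fused, rather than used in isolation.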

Takeaways, Limitations

Takeaways:
  • Integrating global and local information across the image and text modalities improves the accuracy and reliability of object hallucination detection.
  • Presents a new approach that overcomes the limitations of existing methods.
  • The method is training-free, making it easy to apply.
Limitations:
  • Further validation is needed to confirm that GLSim's performance is consistently superior across a variety of scenarios.
  • Performance may still degrade for certain types of object hallucinations.
  • Further research is needed to establish how well the reported experimental results generalize.