This paper proposes GLSim, a novel framework for reliable object hallucination detection in large vision-language models. Unlike existing methods that rely on either a global or a local perspective alone, GLSim combines complementary information by leveraging both global and local embedding similarity signals between the image and text modalities. Experimental results demonstrate that GLSim outperforms existing methods on object hallucination detection.
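The core idea of combining global and local image-text similarity can be sketched as follows. This is an illustrative example only, not the paper's actual method: the pooling choices, the max-over-pairs local matching, and the mixing weight `alpha` are all assumptions made for the sketch.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def glsim_score(patch_embs, token_embs, alpha=0.5):
    """Hypothetical global-local similarity score for a candidate object.

    patch_embs: (P, d) array of image patch embeddings
    token_embs: (T, d) array of text token embeddings for the object mention
    alpha: weight trading off global vs. local evidence (an assumption here)
    """
    # Global signal: cosine similarity of mean-pooled image and text embeddings
    global_sim = cosine(patch_embs.mean(axis=0), token_embs.mean(axis=0))
    # Local signal: best-aligned patch-token pair
    local_sim = max(cosine(p, t) for p in patch_embs for t in token_embs)
    # Low combined similarity would suggest the object may be hallucinated
    return alpha * global_sim + (1 - alpha) * local_sim

rng = np.random.default_rng(0)
patches = rng.normal(size=(16, 64))   # toy image patch embeddings
tokens = rng.normal(size=(3, 64))     # toy text token embeddings
score = glsim_score(patches, tokens)
```

A score below some calibrated threshold would then flag the mentioned object as a likely hallucination; the threshold itself would have to be tuned on held-out data.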