This paper highlights that despite rapid progress in practical applications of multimodal large language models (MLLMs), achieving consistent performance across languages remains a significant challenge, particularly when cultural knowledge is involved. To better assess this issue, the researchers introduce two new benchmarks: KnowRecall and VisRecall. KnowRecall is a visual question answering benchmark that measures factual knowledge consistency across 15 languages, focusing on cultural and historical questions about global landmarks. VisRecall assesses visual memory consistency by having the model describe the appearance of landmarks in nine languages without access to the images. Experimental results show that even state-of-the-art MLLMs, including proprietary models, struggle to achieve cross-lingual consistency, underscoring the need for more robust approaches to building truly multilingual and culturally aware models.
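
To make the idea of cross-lingual consistency concrete, one simple way to score it is to check how often a model's answers to the same question agree across languages. The sketch below is purely illustrative (the function name and the majority-agreement metric are assumptions, not the paper's actual scoring method):

```python
from collections import Counter

def consistency_score(answers_by_language):
    """Illustrative metric (not the paper's definition): the fraction of
    languages whose answer matches the most common answer for one question."""
    counts = Counter(answers_by_language.values())
    most_common_count = counts.most_common(1)[0][1]
    return most_common_count / len(answers_by_language)

# Hypothetical example: a model answers one landmark question in four languages.
answers = {"en": "Eiffel Tower", "fr": "Eiffel Tower",
           "ja": "Eiffel Tower", "zh": "Tokyo Tower"}
print(consistency_score(answers))  # → 0.75
```

A score of 1.0 would mean the model gives the same answer regardless of query language; lower scores indicate the cross-lingual inconsistency the benchmarks are designed to surface.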