In this paper, we present a novel evaluation method for assessing the nonlocal visual reasoning ability of vision-language models (VLMs). Nonlocal visual reasoning refers to reasoning that links evidence gathered from multiple, spatially separated regions of an image, and we classify it into three types: comparative perception, leapfrog search, and smooth visual search. Our experiments on state-of-the-art VLMs, including Gemini 2.5 Pro, Claude Vision 3.7, and GPT-o4-mini, show that these models barely surpass random-chance accuracy on tasks that humans find simple. This suggests that, although VLMs perform well on benchmarks of primitive visual skills, they lack key visual reasoning capabilities. This study provides a structured evaluation suite for verifying whether VLMs can execute human-like vision algorithms.
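
To make the evaluation protocol concrete, the sketch below shows how per-category accuracy on the three task families could be compared against the random-chance baseline mentioned above. This is a minimal illustration, not the paper's actual harness: the names `Task`, `query_model`, and the demo task are hypothetical placeholders.

```python
"""Minimal sketch of a multiple-choice evaluation harness for the three
nonlocal-reasoning task families. All names here (Task, query_model, the demo
task) are hypothetical placeholders, not the paper's actual code."""
import random
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Task:
    category: str       # "comparative_perception" | "leapfrog_search" | "smooth_visual_search"
    image_path: str     # image containing the spatially separated evidence
    question: str
    choices: List[str]
    answer: str         # ground-truth choice


def evaluate(tasks: List[Task],
             query_model: Callable[[str, str, List[str]], str]) -> None:
    """Report per-category accuracy alongside the random-chance baseline."""
    by_cat = {}
    for t in tasks:
        pred = query_model(t.image_path, t.question, t.choices)
        correct = pred.strip() == t.answer
        hits, total, chance = by_cat.get(t.category, (0, 0, 0.0))
        by_cat[t.category] = (hits + correct, total + 1, chance + 1.0 / len(t.choices))
    for cat, (hits, total, chance) in by_cat.items():
        print(f"{cat}: accuracy {hits / total:.2%} vs chance {chance / total:.2%}")


if __name__ == "__main__":
    # Stand-in "model" that guesses uniformly at random, for demonstration only.
    demo = [Task("comparative_perception", "img_0.png",
                 "Which of the two marked patches is darker?", ["left", "right"], "left")]
    evaluate(demo, lambda img, q, choices: random.choice(choices))
```

In such a setup, a real VLM would be plugged in via `query_model` (sending the image and question to the model's API and parsing its chosen option), and a model that genuinely performs nonlocal reasoning should score well above the per-category chance level.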