This paper presents a novel task, CAPTURe (Counting Amodally for Patterns Through Unseen Regions), for evaluating a model's ability to infer patterns that continue behind occluded regions. CAPTURe requires a model to count objects by inferring how a visual pattern continues behind an occluder, assessing both visual pattern recognition and inference. It consists of two versions: CAPTURe-real, which uses images of real objects, and CAPTURe-synthetic, which uses generated images. We evaluated four strong VLMs (GPT-4o, InternVL2, Molmo, and Qwen2-VL) and found that they performed poorly on both occluded and unoccluded patterns, with performance deteriorating further when patterns were occluded. This suggests that VLMs struggle to infer unseen spatial relationships. In contrast, humans achieved very low error rates on CAPTURe. Providing the models with additional information about the locations of occluded objects improved performance, indicating that their errors stem both from an inability to handle occlusion and from difficulty with counting objects in images.