This paper presents MaRVL-QA, a new benchmark for evaluating the mathematical and spatial reasoning capabilities of multimodal large language models (MLLMs). MaRVL-QA uses plots of mathematical surfaces so that reasoning ability can be assessed in isolation, free of semantic noise. It comprises two novel tasks: topological counting, which requires identifying and enumerating features such as local maxima, and transformation recognition, which requires recognizing geometric transformations. Experimental results show that even state-of-the-art MLLMs tend to rely on superficial heuristics rather than robust spatial reasoning. MaRVL-QA is intended to support research on improving the reasoning capabilities of MLLMs.
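
To make the two tasks concrete, the following is a minimal sketch of how a ground-truth answer for a counting question could be computed from a sampled surface, and how a simple geometric transformation could be applied to it. It is an illustration only, not the benchmark's actual generation pipeline: the function names (`sample_surface`, `count_local_maxima`, `rotate90`), the example surface sin(x)·sin(y), the grid resolution, and the 8-neighbor definition of a local maximum are all assumptions chosen for clarity.

```python
import math


def sample_surface(f, lo, hi, n):
    """Sample f(x, y) on an n-by-n grid over [lo, hi] x [lo, hi] (hypothetical helper)."""
    step = (hi - lo) / (n - 1)
    return [[f(lo + i * step, lo + j * step) for j in range(n)] for i in range(n)]


def count_local_maxima(grid):
    """Count interior grid points strictly greater than all 8 neighbors.

    Assumption: a discrete 8-neighbor test stands in for the analytic
    definition of a local maximum; boundary points are ignored.
    """
    rows, cols = len(grid), len(grid[0])
    count = 0
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            center = grid[i][j]
            neighbors = (
                grid[i + di][j + dj]
                for di in (-1, 0, 1)
                for dj in (-1, 0, 1)
                if (di, dj) != (0, 0)
            )
            if all(center > value for value in neighbors):
                count += 1
    return count


def rotate90(grid):
    """Rotate the sampled surface by 90 degrees clockwise (one example transformation)."""
    return [list(row) for row in zip(*grid[::-1])]


if __name__ == "__main__":
    # Example surface (an assumption, not taken from the benchmark).
    grid = sample_surface(lambda x, y: math.sin(x) * math.sin(y), 0.0, 2 * math.pi, 41)
    print(count_local_maxima(grid))            # 2
    print(count_local_maxima(rotate90(grid)))  # 2, the count is rotation-invariant
```

On this example surface the two maxima sit at (π/2, π/2) and (3π/2, 3π/2). A model that reasons about the surface's structure should report the same count before and after the rotation, whereas a heuristic keyed to surface appearance may not.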