This paper studies Euclidean geometry problem solving as a surrogate task for spatial intelligence in multimodal large language models (MLLMs), where spatial intelligence spans abilities such as mentally transforming shapes, rotating objects, judging relative positions, and estimating quantities. We construct Euclid30K, a multimodal dataset of roughly 30,000 plane and solid geometry problems, and fine-tune the Qwen2.5-VL and RoboBrain2.0 model families on it with Group Relative Policy Optimization (GRPO). After training on Euclid30K, and without any task-specific adaptation, the models show zero-shot gains on four spatial reasoning benchmarks: Super-CLEVR, Omni3DBench, VSI-Bench, and MindCube. Notably, the mean accuracy of all trained models on VSI-Bench rises from 34.5% to 40.5%, and RoboBrain2.0-Euclid-7B reaches 49.6%, surpassing the previous state of the art, Spatial-MLLM. These results provide the first systematic evidence that geometry-centric fine-tuning can impart broadly transferable spatial skills to vision-language models.
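
For readers unfamiliar with the training method, GRPO (Shao et al., 2024) is a critic-free variant of PPO: for each prompt it samples a group of responses, scores them with a reward function, and uses the group-normalized reward in place of a learned value baseline. The sequence-level form of the objective is sketched below; the group size $G$, clipping range $\epsilon$, KL coefficient $\beta$, and the exact reward design used for Euclid30K are not given in this abstract and should be read as placeholders from the standard formulation (the token-level factorization is omitted for brevity).

\[
\hat{A}_i = \frac{r_i - \operatorname{mean}(\{r_1,\dots,r_G\})}{\operatorname{std}(\{r_1,\dots,r_G\})},
\qquad
\rho_i(\theta) = \frac{\pi_\theta(o_i \mid q)}{\pi_{\theta_{\text{old}}}(o_i \mid q)},
\]
\[
\mathcal{J}_{\text{GRPO}}(\theta)
= \mathbb{E}_{q,\;\{o_i\}_{i=1}^{G} \sim \pi_{\theta_{\text{old}}}}
\!\left[
\frac{1}{G}\sum_{i=1}^{G}
\min\!\Big(
\rho_i(\theta)\,\hat{A}_i,\;
\operatorname{clip}\big(\rho_i(\theta),\, 1-\epsilon,\, 1+\epsilon\big)\,\hat{A}_i
\Big)
\right]
- \beta\, \mathbb{D}_{\text{KL}}\!\left[\pi_\theta \,\Vert\, \pi_{\text{ref}}\right].
\]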