In this paper, we present a Zero-shot Tool-Integrated Reasoning (ZeroTIR) methodology that uses reinforcement learning (RL) to enable large language models (LLMs) to spontaneously invoke external tools, specifically Python code execution, to improve their mathematical problem solving. The key idea is to train the LLM to generate and execute Python code using RL with outcome-based rewards alone, without any supervised examples of tool usage. Experimental results show that the frequency of spontaneous code execution, response length, and final accuracy all increase with the number of RL training steps, suggesting a quantitative relationship between training effort and the acquisition of effective tool-use strategies. We implement our approach on top of standard RL algorithms and frameworks and demonstrate that it outperforms existing methods.
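
To make the notion of an outcome-based reward concrete, the following is a minimal illustrative sketch, not the paper's exact implementation: it assumes the final answer is reported in a `\boxed{...}` span and is scored by exact match against the ground truth, while the interleaved code blocks and their execution results receive no direct reward.

```python
import re


def outcome_reward(response: str, ground_truth: str) -> float:
    """Outcome-based reward: only the correctness of the final answer counts.

    The model may freely interleave Python code and execution results in
    `response`; none of that is rewarded directly. (Illustrative sketch;
    boxed-answer extraction and exact-match scoring are assumptions, not
    necessarily the paper's implementation.)
    """
    match = re.search(r"\\boxed\{([^}]*)\}", response)
    if match is None:
        return 0.0  # no parsable final answer -> no reward
    predicted = match.group(1).strip()
    return 1.0 if predicted == ground_truth.strip() else 0.0
```

Under this kind of reward, a standard RL algorithm such as PPO receives only a scalar signal per rollout, so the model can learn that emitting and executing code raises its expected reward purely through exploration, which is consistent with the spontaneous tool use described above.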