This paper explores why Tool-Integrated Reasoning (TIR) improves the performance of large language models (LLMs). While LLMs augmented with tools such as Python code interpreters show great promise, a principled theory explaining why this paradigm is effective has been lacking. This study is the first to formally demonstrate that TIR fundamentally expands the capabilities of LLMs: by strictly expanding the model's empirical and feasible support, tools overcome the capability ceiling of pure-text models, enabling problem-solving strategies that would otherwise be impossible or intractably verbose. To guide model behavior without compromising training stability or performance, we present Advantage Shaping Policy Optimization (ASPO), a novel algorithm that directly modifies the advantage function to steer policy actions. We conduct comprehensive experiments on challenging mathematical benchmarks, using a Python interpreter as the external tool. Our results show that TIR models decisively outperform pure-text models in terms of pass@k. Importantly, this advantage is not confined to computationally intensive problems but extends to those requiring significant abstract insight. We further identify emergent cognitive patterns that illustrate how models learn to think with tools. Finally, we report improved tool-use behavior under ASPO, with earlier code invocation and substantially more interactive turns. Overall, this study provides the first principled explanation for the success of TIR, shifting the focus from the mere fact that tools work to why and how they enable more powerful reasoning.
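To make the idea of shaping the advantage directly (rather than the reward) concrete, the following is a minimal sketch. It assumes a GRPO-style group-normalized advantage; the function name `shaped_advantages`, the earliness bonus, and the hyperparameter `alpha` are illustrative assumptions, not the paper's exact formulation of ASPO.

```python
import numpy as np

def shaped_advantages(rewards, first_call_steps, max_steps, alpha=0.1):
    """Group-normalized advantages plus an additive shaping term.

    rewards:          per-rollout scalar rewards (one group, same prompt)
    first_call_steps: step index of each rollout's first tool call
                      (use max_steps if the rollout never calls the tool)
    alpha:            strength of the shaping term (illustrative value)
    """
    rewards = np.asarray(rewards, dtype=float)
    # GRPO-style group baseline: mean-centered, std-normalized.
    adv = (rewards - rewards.mean()) / (rewards.std() + 1e-8)
    # Shaping term: favor rollouts that invoke the tool earlier. It is
    # added to the advantage directly, not to the reward, so reward and
    # return targets used elsewhere in training are left untouched.
    earliness = 1.0 - np.asarray(first_call_steps, dtype=float) / max_steps
    return adv + alpha * earliness
```

Operating on the advantage rather than the reward is what lets this kind of scheme bias the policy toward a desired behavior (here, early code invocation) without perturbing the reward signal that governs training stability.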
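For reference, the pass@k metric used in the comparison above is conventionally computed with the unbiased estimator of Chen et al. (2021), shown below; this is the standard definition, not anything specific to this paper.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n-c, k) / C(n, k),
    given n sampled solutions of which c are correct."""
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)
```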