This paper presents PyVision, an interactive, multi-turn framework that addresses the limitations of large language models (LLMs) in visual reasoning. PyVision enables flexible and interpretable problem solving by allowing LLMs to autonomously generate, execute, and refine Python-based tools tailored to the task at hand. We develop a taxonomy of the tools PyVision generates and analyze their use across a range of benchmarks. Experimental results demonstrate consistent performance gains, including a 7.8% improvement on V* with GPT-4.1 and a 31.1% improvement on VLMsAreBlind-mini with Claude-4.0-Sonnet. These results suggest that dynamic tooling allows models not merely to use tools but to invent them, enabling more autonomous visual reasoning.
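The generate-execute-refine loop described above can be sketched as follows. This is a minimal, hypothetical illustration, not PyVision's actual implementation: `fake_llm`, `run_snippet`, and `multi_turn` are assumed names, and the model call is stubbed out with a fixed code snippet so the example is self-contained.

```python
import io
import contextlib

def fake_llm(prompt, history):
    # Hypothetical stand-in for an LLM call. A real system would send the
    # prompt and the (code, output) history to a model, which would return
    # a Python snippet implementing a task-specific tool.
    return "result = sum(range(10))\nprint(result)"

def run_snippet(code):
    """Execute generated Python, capturing stdout to feed back to the model."""
    buf = io.StringIO()
    env = {}
    with contextlib.redirect_stdout(buf):
        exec(code, env)  # a real system would sandbox this execution
    return buf.getvalue(), env

def multi_turn(prompt, max_turns=3):
    """Iterate: generate a tool, run it, and refine until output appears."""
    history = []
    for _ in range(max_turns):
        code = fake_llm(prompt, history)
        out, env = run_snippet(code)
        history.append((code, out))
        if out:  # stop once the generated tool produced an answer
            return out.strip(), history
    return None, history
```

In a multi-turn setting, each turn's captured output would be appended to the model's context, letting it correct errors in the code it generated or build a new tool on top of earlier results.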