This paper proposes a novel reinforcement learning (RL) framework for effective tool use by large language models (LLMs). To address two challenges inherent in existing RL frameworks, namely building stable training environments and designing verifiable reward mechanisms, we present an automated environment construction pipeline that encompasses scenario decomposition, document generation, feature aggregation, complexity tuning, and local deployment. This pipeline yields high-quality training environments that provide detailed and measurable feedback without relying on external tools. Furthermore, we introduce a verifiable reward mechanism that evaluates both the accuracy of tool invocation and the completeness of task execution, enabling seamless integration with standard RL algorithms. Experiments on LLMs of varying scales demonstrate that the proposed method significantly improves tool-use performance while preserving general capabilities. Our analysis suggests that these gains stem from enhanced context understanding and reasoning, driven by updates to the lower-layer MLP parameters of the models.
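
To make the reward design concrete, the sketch below illustrates one way a verifiable reward could combine tool-invocation accuracy with task-completion checks into a single scalar usable by a standard RL algorithm. This is a minimal illustration under our own assumptions: the function names, the exact-match comparison of tool calls, the state-based completion check, and the weighting coefficient `alpha` are hypothetical and are not the paper's actual definitions.

```python
from typing import Any, Dict, List


def tool_call_reward(predicted: List[Dict[str, Any]],
                     reference: List[Dict[str, Any]]) -> float:
    """Fraction of reference tool calls reproduced exactly (name and arguments)."""
    if not reference:
        return 1.0 if not predicted else 0.0
    remaining = list(predicted)
    matched = 0
    for ref in reference:
        for i, pred in enumerate(remaining):
            if pred.get("name") == ref["name"] and pred.get("arguments") == ref["arguments"]:
                matched += 1
                del remaining[i]  # each predicted call can match at most one reference call
                break
    return matched / len(reference)


def task_completion_reward(final_state: Dict[str, Any],
                           required_state: Dict[str, Any]) -> float:
    """Fraction of required environment-state conditions satisfied after the rollout."""
    if not required_state:
        return 1.0
    satisfied = sum(1 for key, value in required_state.items()
                    if final_state.get(key) == value)
    return satisfied / len(required_state)


def verifiable_reward(predicted: List[Dict[str, Any]],
                      reference: List[Dict[str, Any]],
                      final_state: Dict[str, Any],
                      required_state: Dict[str, Any],
                      alpha: float = 0.5) -> float:
    """Scalar reward in [0, 1]: weighted mix of call accuracy and task completion."""
    return (alpha * tool_call_reward(predicted, reference)
            + (1.0 - alpha) * task_completion_reward(final_state, required_state))


if __name__ == "__main__":
    pred = [{"name": "search_flights", "arguments": {"from": "SFO", "to": "NRT"}}]
    ref = [{"name": "search_flights", "arguments": {"from": "SFO", "to": "NRT"}}]
    print(verifiable_reward(pred, ref,
                            final_state={"booking_confirmed": True},
                            required_state={"booking_confirmed": True}))  # -> 1.0
```

Because both components are computed from the trajectory and the environment's final state, the reward is fully checkable without a learned judge, which is what allows it to plug into standard policy-optimization loops.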