MSARL is a Multi-Small-Agent Reinforcement Learning framework in which multiple small agents collaborate through an explicit division of labor. Existing tool-integrated reasoning systems rely on a single large model that interleaves long-horizon reasoning with precise tool manipulation, which causes cognitive overload and unstable coordination; MSARL instead explicitly decouples reasoning from tool use. A reasoning agent decomposes the problem and plans tool invocations, while multiple tool agents, each specialized in a specific external tool, are trained with a combination of imitation learning and reinforcement learning under role-specific rewards. On mathematical problem solving with code execution, MSARL significantly improves reasoning stability and final-answer accuracy over single-agent baselines. The architecture also generalizes to diverse tool-use tasks, demonstrating that cognitive-role decomposition with small agents is a scalable blueprint for multi-agent system design.
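The role separation described above can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's implementation: all class and function names (`ReasoningAgent`, `CodeToolAgent`, `solve`) are assumptions, the reasoning agent is a stub where the real system uses a trained language model, and the "code execution" tool is a restricted arithmetic evaluator.

```python
# Hypothetical sketch of MSARL-style role separation: a reasoning agent
# plans tool calls but never executes tools; specialized tool agents do.
from dataclasses import dataclass


@dataclass
class ToolCall:
    tool: str      # which tool agent to invoke
    payload: str   # input handed to that agent


class CodeToolAgent:
    """Tool agent specialized in code execution (here: restricted arithmetic eval)."""
    def run(self, payload: str) -> str:
        # Illustrative only; a real tool agent would drive a sandboxed interpreter.
        return str(eval(payload, {"__builtins__": {}}))


class ReasoningAgent:
    """Decomposes a problem into tool invocations instead of running tools itself."""
    def plan(self, problem: str) -> list[ToolCall]:
        # In MSARL this agent is a trained model; the stub simply forwards
        # the expression to the code-execution tool agent.
        return [ToolCall(tool="code", payload=problem)]


def solve(problem: str) -> str:
    tools = {"code": CodeToolAgent()}   # registry of specialized tool agents
    planner = ReasoningAgent()
    results = [tools[c.tool].run(c.payload) for c in planner.plan(problem)]
    return results[-1]                  # final answer from the last tool result


print(solve("(3 + 4) * 2"))  # → 14
```

The key design point mirrored here is that neither component does the other's job: the planner emits structured `ToolCall`s, and each tool agent handles exactly one external capability, which is what makes role-specific rewards well-defined during training.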