Math reasoning with large language models (LLMs) demands substantial computational resources and time because of long generation lengths. Existing efficient inference methods preserve performance well on language tasks, but often severely degrade math performance. In this paper, we propose Caprese, a resource-efficient distillation method for recovering the math capability lost when efficient inference methods are applied, focused specifically on the feedforward blocks. With only about 1% additional parameters and 20K synthetic training samples, and without changing the original weights, Caprese recovers much of the math capability lost to efficient inference. Moreover, Caprese reduces the number of active parameters (by roughly 2 billion for Gemma 2 9B and Llama 3.1 8B) and integrates seamlessly into existing model layers, encouraging more compact responses (up to 8.5% fewer generated tokens) and reducing latency (more than a 16% reduction in time to next token).
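For illustration only, the sketch below shows one plausible way a small, trainable distilled module could be attached alongside a frozen feedforward block, adding on the order of 1% extra parameters while leaving the original weights untouched. The class name, the low-rank bottleneck design, and the parallel wiring are assumptions made for this example, not the actual Caprese architecture described in the paper.

```python
# Illustrative sketch only (assumed design): a small trainable correction
# added in parallel to a frozen feedforward block, so the original weights
# stay unchanged while a lightweight module is distilled to recover lost
# math capability. This is not the paper's actual implementation.
import torch
import torch.nn as nn


class DistilledFeedforwardWrapper(nn.Module):
    """Wraps an existing (frozen) feedforward block and adds a small,
    trainable low-rank correction summed with the original output."""

    def __init__(self, original_ffn: nn.Module, hidden_size: int, rank: int = 64):
        super().__init__()
        self.original_ffn = original_ffn
        for p in self.original_ffn.parameters():
            p.requires_grad = False  # original weights are never updated

        # Small bottleneck: ~2 * hidden_size * rank extra parameters,
        # a small fraction of the full feedforward block's size.
        self.correction = nn.Sequential(
            nn.Linear(hidden_size, rank, bias=False),
            nn.GELU(),
            nn.Linear(rank, hidden_size, bias=False),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen path plus a learned correction; only the correction is
        # trained (e.g., distilled on a small synthetic dataset).
        return self.original_ffn(x) + self.correction(x)


if __name__ == "__main__":
    # Toy usage with a stand-in feedforward block (hidden size 1024).
    ffn = nn.Sequential(nn.Linear(1024, 4096), nn.GELU(), nn.Linear(4096, 1024))
    wrapped = DistilledFeedforwardWrapper(ffn, hidden_size=1024, rank=64)
    out = wrapped(torch.randn(2, 16, 1024))
    print(out.shape)  # torch.Size([2, 16, 1024])
```

Because the wrapper keeps the original block intact and only sums in a small residual term, it can in principle be dropped into existing model layers without retraining or modifying the base weights, which is the integration property the abstract highlights.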