Rule-based reinforcement learning (RL) has been shown to substantially improve the reasoning performance of large language models (LLMs), but the mechanisms underlying its success remain unclear. We find that small-scale supervised fine-tuning (SFT) has a significant influence on subsequent RL yet is sample-inefficient, and we propose an analytical framework to explain this. Comparing the efficiency of SFT and RL by measuring their sample effects, we identify room to make SFT more efficient. Based on this analysis, we propose a "re-distillation" technique that samples from RL-trained policies to improve the effectiveness of small-scale distillation. Across three datasets and both Qwen and Llama models, we show that re-distilled models match RL performance with far fewer samples and far less computation. On the K&K dataset, a re-distilled Qwen2.5-1.5B model outperforms DeepSeek-V3-0324 using only 1K SFT samples. We further show that re-distillation can efficiently balance multiple objectives in RL, and we explain several interesting phenomena in R1-style RL, shedding light on the mechanisms behind its empirical success.
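
To make the re-distillation recipe concrete, the sketch below illustrates one plausible reading of it: sample reasoning traces from an RL-trained policy, keep only traces whose final answers pass a rule-based check, and run plain SFT on a small base model with the curated set. This is a hedged illustration, not the paper's implementation: the checkpoint path, the `is_correct` checker, and all hyperparameters are placeholder assumptions.

```python
# Minimal re-distillation sketch (assumed pipeline, illustrative settings).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

RL_POLICY = "path/to/rl-trained-policy"   # hypothetical RL-trained checkpoint
STUDENT = "Qwen/Qwen2.5-1.5B"             # small base model to re-distill into
device = "cuda" if torch.cuda.is_available() else "cpu"


def is_correct(trace: str, gold: str) -> bool:
    # Naive placeholder for a rule-based answer checker (assumption).
    return gold in trace


def sample_traces(prompts, answers, n_per_prompt=4, max_new_tokens=512):
    """Sample from the RL policy and keep only traces with correct answers."""
    tok = AutoTokenizer.from_pretrained(RL_POLICY)
    policy = AutoModelForCausalLM.from_pretrained(
        RL_POLICY, torch_dtype=torch.bfloat16
    ).to(device)
    kept = []
    for prompt, gold in zip(prompts, answers):
        inputs = tok(prompt, return_tensors="pt").to(device)
        outputs = policy.generate(
            **inputs,
            do_sample=True,
            temperature=1.0,
            num_return_sequences=n_per_prompt,
            max_new_tokens=max_new_tokens,
        )
        prompt_len = inputs["input_ids"].shape[1]
        for seq in outputs:
            trace = tok.decode(seq[prompt_len:], skip_special_tokens=True)
            if is_correct(trace, gold):
                kept.append((prompt, trace))
    return kept


def redistill(sft_pairs, epochs=3, lr=1e-5):
    """Plain SFT of the small student model on the ~1K curated traces."""
    tok = AutoTokenizer.from_pretrained(STUDENT)
    student = AutoModelForCausalLM.from_pretrained(
        STUDENT, torch_dtype=torch.bfloat16
    ).to(device)
    opt = torch.optim.AdamW(student.parameters(), lr=lr)
    student.train()
    for _ in range(epochs):
        for prompt, trace in sft_pairs:
            ids = tok(prompt + trace, return_tensors="pt",
                      truncation=True).input_ids.to(device)
            loss = student(ids, labels=ids).loss  # standard next-token SFT loss
            loss.backward()
            opt.step()
            opt.zero_grad()
    return student
```

In this reading, the RL stage pays the exploration cost once, and the correctness filter turns its rollouts into a compact SFT set, which is what lets a ~1K-sample distillation recover RL-level performance.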