In this paper, we analyze the advantages and disadvantages of supervised fine-tuning (SFT) and reinforcement fine-tuning (RFT), two post-training techniques for large language models (LLMs), and propose a new method, Prefix-RFT, that integrates them. SFT imitates demonstrations well but generalizes poorly, while RFT is effective at improving performance yet can learn unexpected behaviors and is sensitive to the initial policy. Prefix-RFT combines the strengths of both by performing demonstration learning and exploratory learning simultaneously, and experiments on mathematical reasoning problems show that it outperforms SFT, RFT, and parallel mixed-policy RFT. In addition, it can be integrated into existing open-source frameworks with little effort, and experiments also confirm its robustness to the quality and quantity of the demonstration data.
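
To make the idea of "demonstration learning and exploratory learning simultaneously" concrete, below is a minimal, hypothetical sketch of a hybrid update that mixes an SFT cross-entropy loss on demonstration tokens with a simple REINFORCE-style loss on model-generated rollouts. This is an illustration of the general blending principle, not the paper's exact Prefix-RFT objective; all names (`TinyLM`, `hybrid_step`, the toy reward, the `mix` weight) are assumptions introduced here for the example.

```python
# Illustrative sketch only: blend an imitation (SFT) term on demonstrations
# with an exploration (RFT) term on sampled rollouts. Not the paper's exact method.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, HIDDEN, MAX_LEN = 32, 64, 16


class TinyLM(nn.Module):
    """Minimal causal LM stand-in for the policy (hypothetical toy model)."""
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, HIDDEN)
        self.rnn = nn.GRU(HIDDEN, HIDDEN, batch_first=True)
        self.head = nn.Linear(HIDDEN, VOCAB)

    def forward(self, tokens):                 # tokens: (B, T)
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h)                    # logits: (B, T, V)


def sft_loss(model, demo):
    """Cross-entropy on demonstration tokens (imitation term)."""
    logits = model(demo[:, :-1])
    return F.cross_entropy(logits.reshape(-1, VOCAB), demo[:, 1:].reshape(-1))


def rft_loss(model, prompts, reward_fn):
    """REINFORCE-style term on sampled rollouts (exploration term)."""
    tokens, logps = prompts, []
    for _ in range(MAX_LEN):
        dist = torch.distributions.Categorical(logits=model(tokens)[:, -1])
        nxt = dist.sample()
        logps.append(dist.log_prob(nxt))
        tokens = torch.cat([tokens, nxt[:, None]], dim=1)
    rewards = reward_fn(tokens)                # (B,), e.g. answer correctness
    baseline = rewards.mean()                  # simple variance-reduction baseline
    return -((rewards - baseline) * torch.stack(logps, dim=1).sum(1)).mean()


def hybrid_step(model, opt, demo, prompts, reward_fn, mix=0.5):
    """One update blending imitation and exploration, weighted by `mix`."""
    loss = mix * sft_loss(model, demo) + (1 - mix) * rft_loss(model, prompts, reward_fn)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()


if __name__ == "__main__":
    torch.manual_seed(0)
    model = TinyLM()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    demo = torch.randint(0, VOCAB, (4, MAX_LEN))      # fake demonstrations
    prompts = torch.randint(0, VOCAB, (4, 4))         # fake prompts
    reward = lambda seq: (seq[:, -1] == 7).float()    # toy binary reward
    print(hybrid_step(model, opt, demo, prompts, reward))
```

In this toy setup, the `mix` weight plays the role of trading off how much the update follows the demonstrations versus the reward signal from the model's own rollouts; how Prefix-RFT actually couples the two (e.g., via demonstration prefixes) should be taken from the paper itself.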