In this paper, we present a novel method to address the catastrophic forgetting that occurs when supervised fine-tuning (SFT) is used to improve the instruction-following ability of open-source large language models (LLMs). Without access to the original SFT data, we reconstruct the instruction distribution of the base model and synthesize a high-quality general-purpose dataset through a multi-model generation and filtering pipeline. By mixing this synthetic dataset with new domain-specific data and fine-tuning the model on the mixture, we experimentally demonstrate improved performance on the target tasks without degrading general-domain performance.
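To make the data-mixing step concrete, the following is a minimal sketch, not the paper's released code: the function name `mix_datasets`, the `synthetic_ratio` parameter, and its 0.5 default are illustrative assumptions, since the text above does not specify how the synthetic and domain-specific data are combined.

```python
import random

def mix_datasets(synthetic, domain, synthetic_ratio=0.5, seed=0):
    """Combine synthetic general-purpose examples with domain-specific ones.

    `synthetic_ratio` is the fraction of general-purpose data in the final
    fine-tuning mixture; the value 0.5 is purely illustrative, as the paper's
    actual ratio is not stated here.
    """
    assert 0.0 < synthetic_ratio < 1.0, "ratio must be strictly between 0 and 1"
    rng = random.Random(seed)

    # Number of synthetic examples needed so that they make up
    # `synthetic_ratio` of the combined dataset.
    n_synth = int(len(domain) * synthetic_ratio / (1.0 - synthetic_ratio))

    # Sample without replacement, capped at the available synthetic data.
    sampled = rng.sample(synthetic, min(n_synth, len(synthetic)))

    # Shuffle so that synthetic and domain examples are interleaved
    # throughout fine-tuning rather than seen in separate phases.
    mixture = list(domain) + sampled
    rng.shuffle(mixture)
    return mixture
```

Under this sketch, the returned mixture would be fed to a standard SFT loop; keeping general-purpose examples interleaved with the new domain data is the mechanism the abstract credits for preserving general-domain performance.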