This paper studies self-improvement, a technique for improving the performance of LLMs without relying on external data. Specifically, we theoretically model the self-improvement training dynamics through the gap between an LLM's solver and verifier capabilities. This model allows us to characterize the entire training trajectory and to quantify performance limits based on experimental results. We validate the effectiveness of the theoretical framework on a variety of LLMs and datasets, and analyze the impact of external data on final performance in settings where external data is limited.
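As a minimal illustrative sketch (not necessarily the paper's exact formulation; the symbols $S(t)$, $V(t)$, and $\alpha$ are hypothetical notation introduced here), gap-driven training dynamics of this kind can be written as an ordinary differential equation in which solver capability improves at a rate proportional to the solver-verifier gap:
\[
\frac{dS(t)}{dt} = \alpha \,\big( V(t) - S(t) \big),
\]
where $S(t)$ denotes solver capability at training time $t$, $V(t)$ denotes verifier capability, and $\alpha > 0$ is a learning-efficiency constant. Under this assumed form, the solver converges toward the verifier's level, so the verifier capability would bound final performance in the absence of external data.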