This paper proposes Post-Completion Learning (PCL), a novel learning framework that exploits the sequence space after the model's output is complete, overcoming the limitation of existing language model training, which terminates at the end-of-sequence token (<eos>). During training, PCL has the model continue generating self-evaluations and reward predictions even after it completes its output, thereby enhancing its reasoning and self-evaluation capabilities, while at inference time generation still stops at the completion point, so efficiency is preserved. This is achieved through a white-box reinforcement learning method in which the model evaluates its own outputs according to the reward rules, and the resulting scores are supervised by aligning them with the reward function. To jointly optimize reasoning and evaluation capabilities, we design a dual-track SFT scheme and combine it with RL training to achieve multi-objective hybrid optimization. Experimental results on multiple datasets and models demonstrate consistent performance improvements over existing SFT and RL methods.
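The following is a minimal sketch of the post-completion idea described above, not the authors' implementation; the tag names, helper functions, and sequence format are hypothetical, and only the high-level behavior (training text extends past <eos>, inference truncates at <eos>) reflects the abstract.

```python
# Hypothetical sketch of post-completion sequence construction (not the
# authors' code). Tag names such as <eval> and <reward> are assumptions.

EOS = "<eos>"  # assumed end-of-sequence token

def build_pcl_training_text(prompt: str, answer: str,
                            self_eval: str, reward_score: float) -> str:
    """Build a training sequence that continues past <eos> with a
    self-evaluation and a predicted reward, as PCL proposes."""
    post_completion = (
        f" <eval>{self_eval}</eval> <reward>{reward_score:.2f}</reward>"
    )
    return f"{prompt}{answer}{EOS}{post_completion}"

def truncate_for_inference(generated: str) -> str:
    """At inference time, stop at the completion point: everything after
    <eos> is discarded, so decoding cost is unchanged."""
    return generated.split(EOS, 1)[0]

# Example usage
text = build_pcl_training_text(
    prompt="Q: 2 + 3 = ? A: ",
    answer="5",
    self_eval="The arithmetic is correct and the answer follows the format.",
    reward_score=1.0,
)
print(text)                          # full sequence seen during training
print(truncate_for_inference(text))  # what the user sees at inference
```

Under this reading, the reward prediction appended after <eos> is what the white-box RL objective would supervise against the true reward function.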