Probabilistic predictions of machine learning classifiers are typically assessed with a proper loss function such as cross-entropy, which decomposes into two components: calibration error measures overall under- or overconfidence, while refinement error measures the ability to distinguish between classes. We introduce a new variational formulation of this calibration-refinement decomposition, which sheds new light on post-hoc calibration and enables fast estimation of both terms. Using it, we provide theoretical and empirical evidence that the calibration and refinement errors are not minimized simultaneously during training; consequently, selecting the best epoch based on validation loss yields a suboptimal trade-off for both terms. To address this, we propose Refine then Calibrate: minimize only the refinement error during training, then minimize the calibration error post hoc with standard recalibration techniques. The method integrates seamlessly with any classifier and consistently improves performance across a variety of classification tasks.
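To make the recipe concrete, the sketch below illustrates one plausible instantiation of the idea, not the paper's exact procedure: the refinement error of a model checkpoint is approximated as its validation cross-entropy after an optimal temperature rescaling (the part of the loss that recalibration cannot remove), epochs are selected by this quantity rather than by raw validation loss, and the chosen model is then calibrated post hoc with temperature scaling. The estimator choice, helper names, and use of temperature scaling are assumptions for illustration.

```python
# Illustrative sketch of a "refine, then calibrate" pipeline.
# Assumptions (not taken from the abstract): refinement error is estimated as
# validation cross-entropy after temperature scaling, and post-hoc calibration
# is done by temperature scaling. All function names here are hypothetical.
import numpy as np
from scipy.optimize import minimize_scalar


def cross_entropy(logits, labels):
    """Mean cross-entropy of softmax(logits) against integer labels."""
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


def fit_temperature(logits, labels):
    """Find the temperature T > 0 minimizing validation cross-entropy."""
    res = minimize_scalar(lambda t: cross_entropy(logits / t, labels),
                          bounds=(1e-2, 1e2), method="bounded")
    return res.x


def refinement_error(logits, labels):
    """Validation loss after the best temperature rescaling: an estimate of
    the refinement term, i.e. the loss that calibration alone cannot remove."""
    t = fit_temperature(logits, labels)
    return cross_entropy(logits / t, labels)


def select_epoch_and_calibrate(epoch_val_logits, val_labels):
    """Pick the checkpoint with the lowest estimated refinement error,
    then return its index and the post-hoc calibration temperature."""
    errors = [refinement_error(l, val_labels) for l in epoch_val_logits]
    best = int(np.argmin(errors))
    return best, fit_temperature(epoch_val_logits[best], val_labels)


# Usage example with synthetic per-epoch validation logits.
rng = np.random.default_rng(0)
val_labels = rng.integers(0, 3, size=200)
epoch_val_logits = [rng.normal(size=(200, 3)) for _ in range(5)]
best_epoch, temperature = select_epoch_and_calibrate(epoch_val_logits, val_labels)
print(f"selected epoch {best_epoch}, calibration temperature {temperature:.2f}")
```

The point of the sketch is the selection criterion: checkpoints are compared by their loss after an idealized recalibration step, so a model that is poorly calibrated but highly discriminative is not discarded early, since its calibration error can be removed afterwards.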