Fine-tuning large language models (LLMs) on narrowly harmful datasets can lead to behavior that is broadly misaligned with human values. To understand when and how this emergent misalignment arises, we developed a comprehensive framework for detecting and characterizing rapid transitions during fine-tuning, using both distributional-shift detection methods and order parameters formulated in plain English and evaluated by LLM judges. Using objective statistical dissimilarity measures, we quantified how the phase transitions that occur during fine-tuning affect different aspects of the model. Specifically, we assessed what percentage of the total distributional change in model outputs is captured by each aspect, such as alignment or verbosity, yielding a decomposition of the overall transition. We also found that the actual behavioral transition occurs later in training than the peak of the gradient norm would suggest. Our framework enables the automatic discovery and quantification of language-based order parameters, which we demonstrate on examples ranging from knowledge questions to politics and ethics.