This paper presents a method for discovering and aligning features across model checkpoints, using sparse crosscoders to understand when and how specific language abilities emerge during the pretraining of large language models (LLMs). We aim to overcome the limitations of existing benchmarking approaches and to understand model training at a conceptual level. Specifically, we train crosscoders on three pairs of open-source checkpoints that exhibit significant performance and representational variation, and we introduce a novel metric, the relative indirect effect (RelIE), to track the training phases at which individual features become causally important for task performance. We show that this enables the detection of feature emergence, retention, and disruption during pretraining. The method is architecture-independent and highly scalable, offering a promising path toward interpretable and fine-grained analysis of representation learning across pretraining.
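To make the RelIE idea concrete, the sketch below shows one plausible instantiation, not necessarily the paper's exact definition: a feature's indirect effect (IE) at a checkpoint is the change in task loss when that crosscoder feature is ablated, and RelIE is the later checkpoint's share of the total effect across the checkpoint pair. All function names and numbers are hypothetical placeholders.

```python
# Illustrative sketch only: a RelIE-style score for a shared crosscoder feature
# across an earlier and a later checkpoint. Losses here are made-up numbers;
# in practice they would come from running the task with and without ablating
# the feature's activation in each checkpoint.

def indirect_effect(loss_clean: float, loss_ablated: float) -> float:
    """Causal importance of a feature: loss change caused by ablating it."""
    return abs(loss_ablated - loss_clean)

def relative_ie(ie_early: float, ie_late: float, eps: float = 1e-9) -> float:
    """Later checkpoint's share of the feature's total causal effect.
    Near 1: feature becomes important late (emergence).
    Near 0.5: feature matters at both checkpoints (retention).
    Near 0: feature matters only early (disruption/loss)."""
    return ie_late / (ie_early + ie_late + eps)

# Hypothetical (clean, ablated) loss pairs for three crosscoder features.
measurements = {
    "feature_12": {"early": (2.31, 2.33), "late": (1.10, 1.55)},  # emerges late
    "feature_47": {"early": (2.31, 2.80), "late": (1.10, 1.58)},  # retained
    "feature_93": {"early": (2.31, 2.90), "late": (1.10, 1.11)},  # disrupted
}

for name, m in measurements.items():
    ie_a = indirect_effect(*m["early"])
    ie_b = indirect_effect(*m["late"])
    print(f"{name}: RelIE = {relative_ie(ie_a, ie_b):.2f}")
```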