In this paper, we present a systematic analysis of the phenomenon that generative image models trained on large datasets frequently fail to generate images of simple concepts that are expected to appear in the training data, such as hands or groups of four objects. We extract interpretable concept embeddings using sparse autoencoders (SAEs) and quantitatively compare concept frequencies between real and generated images. In particular, we analyze the conceptual gaps of four popular generative models, Stable Diffusion 1.5/2.1, PixArt, and Kandinsky, using a large-scale RA-SAE trained to extract 32,000 concepts from DINOv2 features. As a result, we detect omissions (e.g., bird feeder, DVD disc, margins of a document) and exaggerations (e.g., wood background texture, palm tree) of specific concepts, and, at the level of individual data points, isolate memorization artifacts in which the models reproduce specific visual templates seen during training. In conclusion, we propose a theoretically grounded framework for systematically identifying conceptual blind spots in generative models by assessing the conceptual fidelity of the underlying data generation process.
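To make the frequency comparison concrete, the minimal sketch below illustrates one way per-concept activation frequencies could be computed and contrasted between real and generated images, assuming SAE concept activations have already been extracted for both sets. The activation threshold, the log-ratio gap score, and all function names are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np


def concept_frequencies(sae_codes: np.ndarray, threshold: float = 0.0) -> np.ndarray:
    """Fraction of images in which each concept fires.

    sae_codes: (n_images, n_concepts) matrix of SAE activations,
    e.g., codes from an RA-SAE over DINOv2 features.
    """
    return (sae_codes > threshold).mean(axis=0)


def concept_gap_scores(real_codes: np.ndarray,
                       gen_codes: np.ndarray,
                       eps: float = 1e-6) -> np.ndarray:
    """Log-ratio of generated vs. real concept frequencies (assumed metric).

    Strongly negative scores flag candidate omissions; strongly
    positive scores flag candidate exaggerations.
    """
    f_real = concept_frequencies(real_codes)
    f_gen = concept_frequencies(gen_codes)
    return np.log((f_gen + eps) / (f_real + eps))


# Toy usage with synthetic sparse codes (real setting: 32,000 concepts).
rng = np.random.default_rng(0)
n_images, n_concepts = 1_000, 2_000
real = rng.exponential(0.1, (n_images, n_concepts)) * (rng.random((n_images, n_concepts)) < 0.02)
gen = rng.exponential(0.1, (n_images, n_concepts)) * (rng.random((n_images, n_concepts)) < 0.02)

scores = concept_gap_scores(real, gen)
omitted = np.argsort(scores)[:10]        # candidate blind spots
exaggerated = np.argsort(scores)[-10:]   # candidate over-represented concepts
print("Most under-generated concept ids:", omitted)
print("Most over-generated concept ids:", exaggerated)
```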