Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Uncovering Conceptual Blindspots in Generative Image Models Using Sparse Autoencoders

Created by
  • Haebom

Author

Matyas Bohacek, Thomas Fel, Maneesh Agrawala, Ekdeep Singh Lubana

Outline

In this paper, we present a systematic analysis of why generative image models trained on large datasets frequently fail to generate simple concepts that should be well represented in the training data, such as hands or groups of four objects. We extract interpretable concept embeddings using sparse autoencoders (SAEs) and quantitatively compare concept frequencies between real and generated images. Specifically, we analyze the conceptual gaps of four popular generative models (Stable Diffusion 1.5, Stable Diffusion 2.1, PixArt, and Kandinsky) using a large-scale RA-SAE with 32,000 concepts trained on DINOv2 features. As a result, we detect omissions (e.g., bird feeder, DVD disc, margins of a document) and exaggerations (e.g., wood background texture, palm tree) of specific concepts, and we isolate memorization artifacts, cases where a model reproduces, at the level of individual data points, specific visual templates it saw during training. In conclusion, we propose a theoretically grounded framework for systematically identifying conceptual blind spots in generative models by assessing how faithfully they reflect the concept distribution of the underlying data-generating process.
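
The frequency-comparison step at the core of this method can be sketched in a few lines of code. The snippet below is a minimal, hypothetical illustration, not the authors' implementation: it assumes a pretrained SAE encoder over image features (e.g., DINOv2 embeddings), measures how often each concept fires in real versus generated images, and ranks concepts by their frequency ratio to surface candidate omissions and exaggerations. All names, dimensions, and thresholds are illustrative assumptions.

```python
# Minimal sketch (not the paper's code): compare per-concept activation
# frequencies between real and generated images via a sparse autoencoder.
import torch
import torch.nn as nn

class SparseAutoencoderEncoder(nn.Module):
    """Encoder half of a (hypothetical) pretrained SAE: dense features -> sparse concept codes."""
    def __init__(self, feature_dim: int = 768, num_concepts: int = 32_000):
        super().__init__()
        self.encoder = nn.Linear(feature_dim, num_concepts)

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        # ReLU keeps only positively activated concepts, yielding a sparse code.
        return torch.relu(self.encoder(features))

@torch.no_grad()
def concept_frequencies(features: torch.Tensor, sae: SparseAutoencoderEncoder,
                        threshold: float = 0.0) -> torch.Tensor:
    """Fraction of images in which each concept activates above `threshold`."""
    codes = sae(features)                        # (num_images, num_concepts)
    return (codes > threshold).float().mean(0)   # (num_concepts,)

def blindspot_report(freq_real: torch.Tensor, freq_gen: torch.Tensor,
                     eps: float = 1e-6, k: int = 10):
    """Rank concepts by how under- or over-represented they are in generated images."""
    ratio = (freq_gen + eps) / (freq_real + eps)
    omitted = torch.argsort(ratio)[:k]                         # common in real, rare in generated
    exaggerated = torch.argsort(ratio, descending=True)[:k]    # rare in real, common in generated
    return omitted, exaggerated

if __name__ == "__main__":
    sae = SparseAutoencoderEncoder()
    # Stand-ins for DINOv2 embeddings of real vs. generated images.
    real_feats = torch.randn(1_000, 768)
    gen_feats = torch.randn(1_000, 768)
    omitted, exaggerated = blindspot_report(
        concept_frequencies(real_feats, sae),
        concept_frequencies(gen_feats, sae),
    )
    print("candidate omitted concept ids:", omitted.tolist())
    print("candidate exaggerated concept ids:", exaggerated.tolist())
```

In practice, the concept ids returned here would be mapped back to human-interpretable labels (e.g., by inspecting the images that most strongly activate each concept) before declaring a blind spot.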

Takeaways, Limitations

Takeaways:
We present a novel methodology for systematically analyzing and quantifying conceptual blind spots in generative image models.
By extracting and comparing concept embeddings with sparse autoencoders (SAEs), we make the limitations of generative models explicit.
Experiments on several generative models (Stable Diffusion 1.5/2.1, PixArt, Kandinsky) identify specific omitted and exaggerated concepts.
Identifying memorization artifacts in the models deepens our understanding of their training process.
The results suggest directions for improving the performance and reliability of generative models.
Limitations:
Further validation is needed to confirm that the concept embeddings of the SAE used are fully interpretable.
The type and number of concepts used in the analysis can affect the evaluation of model performance.
Because the analysis is based on a specific dataset, further research is needed on generalizability to other datasets.
The accuracy and objectivity of memorization-artifact identification require further validation.