Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

WASP: A Weight-Space Approach to Detecting Learned Spuriousness

Posted by
  • Haebom

Authors

Cristian Daniel Păduraru, Antonio Bărbălau, Radu Filipescu, Andrei Liviu Nicolicioiu, Elena Burceanu

Outline

This paper argues that machine learning models must be trained to clearly understand what defines each class. Previous work identifies spurious correlations in datasets by relying solely on data or error analysis, so it cannot detect spurious correlations that a model has learned unless counterexamples in the training or validation sets reveal them. To overcome this limitation, the paper proposes WASP (Weight-space Approach to Detecting Spuriousness), a method that analyzes the model's weights, the mechanism behind its decisions, rather than its predictions. WASP examines how a base model's weights drift toward capturing various (spurious) correlations while the model is fine-tuned on a specific dataset. Unlike previous approaches, WASP (i) exposes spurious correlations in a dataset even when no training or validation counterexample reveals them, (ii) works across modalities such as images and text, and (iii) uncovers previously unknown spurious correlations learned by ImageNet-1k classifiers.
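The weight-drift idea can be illustrated with a toy sketch. This is not the authors' implementation: the summary does not say how drift is measured or how candidate concepts are scored, so the cosine-similarity ranking, the function names (drift, cosine, rank_concepts), the concept labels, and the three-dimensional vectors below are all assumptions made purely for illustration.

```python
import math

def drift(base, tuned):
    """Weight drift for one class: fine-tuned weights minus base weights."""
    return [t - b for b, t in zip(base, tuned)]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def rank_concepts(base, tuned, concepts):
    """Rank hypothetical concept embeddings by alignment with the weight drift."""
    d = drift(base, tuned)
    scores = {name: cosine(d, emb) for name, emb in concepts.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Toy, made-up data: class weights before and after fine-tuning.
base_w = [1.0, 0.0, 0.0]
tuned_w = [1.0, 0.9, 0.1]

# Hypothetical concept embeddings living in the same space as the weights.
concepts = {
    "water background": [0.0, 1.0, 0.0],
    "wings": [1.0, 0.0, 0.0],
    "beak": [0.0, 0.0, 1.0],
}

ranking = rank_concepts(base_w, tuned_w, concepts)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy setup the drift points almost entirely along the "water background" direction, so that concept ranks first. A high-ranking concept that should not define the class would be a candidate spurious correlation to inspect by hand.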

Takeaways, Limitations

Takeaways:
Weight analysis of the model can identify spurious correlations that traditional methods would not detect.
It can be applied to various modalities such as images and text.
It may discover new, previously unknown, spurious correlations.
Limitations:
Further experiments and analysis are needed to investigate the performance and generalization ability of WASP.
Further research is needed to determine whether all types of spurious correlations can be perfectly identified.
Weight analysis of complex models can be computationally expensive.