Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

Counterfactual Influence as a Distributional Quantity

Created by
  • Haebom

Authors

Matthieu Meeus, Igor Shilov, Georgios Kaissis, Yves-Alexandre de Montjoye

Outline

This paper addresses memorization in machine learning models, a form of overfitting that raises both privacy and generalization concerns. The standard memorization metric, counterfactual self-influence, quantifies how much a model's prediction on a sample changes depending on whether that sample was included in the training set. Recent work has shown, however, that other training samples, especially (near-)duplicates, also substantially affect whether a given sample is memorized.

The paper therefore studies memorization by treating counterfactual influence as a distributional quantity: for each sample, it considers the influence of every training sample on that sample's memorization, not just the sample's influence on itself. Using small language models, the authors compute the full influence distribution across all training samples and analyze its properties. They find that considering self-influence alone can severely underestimate the practical risk of memorization: (near-)duplicated samples sharply reduce each other's self-influence yet remain (near-)extractable. A similar pattern appears in CIFAR-10 image classification, where (near-)duplicates can be identified from the influence distribution alone. The paper concludes that memorization arises from complex interactions among training samples, and that the full influence distribution captures it far better than self-influence alone.
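
To make the central quantities concrete, here is a minimal sketch of counterfactual influence under the leave-one-out reading described above. `train` (returning a fitted model) and `loss` (a per-sample loss) are hypothetical placeholders, and nothing below is the authors' implementation.

```python
# Illustrative sketch only; `train` and `loss` are assumed placeholders.
def influence(dataset, j, z, train, loss):
    """Counterfactual influence of training sample dataset[j] on a sample z:
    the change in the model's loss on z when dataset[j] is left out."""
    model_full = train(dataset)
    model_loo = train(dataset[:j] + dataset[j + 1:])  # retrain without sample j
    return loss(model_loo, z) - loss(model_full, z)

def self_influence(dataset, i, train, loss):
    """Self-influence: a sample's influence on itself."""
    return influence(dataset, i, dataset[i], train, loss)

def influence_distribution(dataset, i, train, loss):
    """The distributional view: the influence of every training sample on
    sample i, of which self-influence is just one entry."""
    return [influence(dataset, j, dataset[i], train, loss)
            for j in range(len(dataset))]
```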

Takeaways and Limitations

Takeaways:
  • Understanding memorization requires considering not only self-influence but also the influence of other training samples, especially (near-)duplicates.
  • Analyzing the full influence distribution, which captures the complex interactions among training samples, is important for memorization research.
  • (Near-)duplicate samples reduce self-influence yet remain extractable, which poses a privacy risk (see the sketch after this list).
  • Analyzing self-influence alone is insufficient; a more comprehensive analysis method is needed.
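
As a hedged illustration of the duplicate-related takeaways: given a precomputed influence matrix `infl`, with `infl[j, i]` the influence of training sample j on sample i (so column i is sample i's influence distribution), a sample whose strongest cross-influence rivals its self-influence plausibly has a near-duplicate in the training set. The function name and the 0.5 ratio are assumptions for this sketch, not values from the paper.

```python
import numpy as np

def flag_near_duplicates(infl: np.ndarray, ratio: float = 0.5):
    """Flag pairs (i, j) where sample j's influence on sample i rivals
    sample i's self-influence; threshold is illustrative only."""
    pairs = []
    for i in range(infl.shape[0]):
        cross = np.delete(infl[:, i], i)  # influences from all other samples
        j = int(np.argmax(cross))
        j_orig = j if j < i else j + 1    # undo the index shift from np.delete
        if cross[j] >= ratio * max(infl[i, i], 1e-12):
            pairs.append((i, j_orig))
    return pairs
```
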
Limitations:
  • Results are reported only for a small language model and the CIFAR-10 dataset, so generalizability requires further study.
  • Additional experiments with models of other sizes and architectures are needed to verify that the findings generalize.
  • Computing the full influence distribution is computationally expensive, and efficient methods are needed to apply it to large datasets (see the cost sketch below).
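
To see where that cost comes from, the brute-force computation below (same hypothetical `train`/`loss` placeholders as earlier) needs one leave-one-out retraining per training sample: n + 1 model trainings for an n × n influence matrix, which quickly becomes infeasible at scale.

```python
import numpy as np

def influence_matrix(dataset, train, loss):
    """Brute-force full influence distribution: infl[j, i] is the influence
    of training sample j on sample i. Requires n leave-one-out retrainings."""
    model_full = train(dataset)
    base = np.array([loss(model_full, z) for z in dataset])
    n = len(dataset)
    infl = np.zeros((n, n))
    for j in range(n):                                    # one retraining per sample
        model_loo = train(dataset[:j] + dataset[j + 1:])  # leave sample j out
        for i, z in enumerate(dataset):
            infl[j, i] = loss(model_loo, z) - base[i]
    return infl
```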