Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; when sharing, simply cite the source.

Model Collapse Is Not a Bug but a Feature in Machine Unlearning for LLMs

Created by
  • Haebom

Author

Yan Scholten, Sophie Xhonneux, Leo Schwinn, Stephan Günnemann

Outline

Existing unlearning (information removal) methods for large language models (LLMs) include the data to be removed in the fine-tuning objective, which risks exposing sensitive data and violates the principle of minimal data use. To address this, this paper proposes Partial Model Collapse (PMC), a novel unlearning method whose objective does not contain the data to be removed. PMC exploits model collapse, the degradation of a generative model's output distribution when it is trained on its own generations, and intentionally induces this collapse on the data targeted for removal, thereby erasing the associated information. The authors theoretically show that PMC converges to the desired outcome and overcomes three key limitations of existing unlearning methods, and experimentally demonstrate that it removes private information from model outputs more effectively while maintaining general model utility.
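A minimal sketch of the core idea, not the authors' reference implementation: for prompts tied to the forget data, repeatedly sample the model's own completions and fine-tune on them, feeding its distribution back into itself so that it collapses on exactly those prompts. The model name, `forget_prompts`, the number of rounds, and all hyperparameters below are illustrative assumptions; the utility-preserving ("partial") machinery of the paper is omitted.

```python
# Hypothetical sketch of collapse-based unlearning on a forget set (assumptions noted above).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder; the paper targets larger LLMs
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# Prompts associated with the information to be removed (hypothetical example).
forget_prompts = ["What is Jane Doe's home address?"]

for step in range(3):  # illustrative number of collapse rounds
    # 1) Sample the model's own completions for the forget prompts.
    #    Note: no ground-truth sensitive data enters the objective.
    model.eval()
    inputs = tokenizer(forget_prompts, return_tensors="pt", padding=True)
    with torch.no_grad():
        generated = model.generate(
            **inputs,
            do_sample=True,
            max_new_tokens=32,
            pad_token_id=tokenizer.eos_token_id,
        )

    # 2) Fine-tune on the self-generated text, pushing the model's
    #    distribution on these prompts toward collapse.
    model.train()
    labels = generated.clone()
    labels[labels == tokenizer.pad_token_id] = -100  # ignore padding in the loss
    loss = model(input_ids=generated, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

Because only the forget prompts are trained on, collapse is confined to that part of the distribution; the paper's full method additionally controls how much of the distribution collapses so that general utility is preserved.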

Takeaways and Limitations

Takeaways:
  • Removes private information by exploiting model collapse, without explicitly including the data to be removed in the unlearning objective.
  • Overcomes key limitations of existing unlearning methods.
  • Maintains the general utility of the model.
  • Represents a significant step toward comprehensive unlearning methods that meet real-world privacy requirements.
Limitations:
  • The paper does not discuss specific limitations.