Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

LLM Unlearning Without an Expert Curated Dataset

Created by
  • Haebom

Author

Xiaoyuan Zhu, Muru Zhang, Ollie Liu, Robin Jia, Willie Neiswanger

Outline

This paper explores post-hoc unlearning, which removes specific knowledge domains without retraining the entire model, to address the problem of modern large language models encoding sensitive, harmful, or copyrighted knowledge. A key bottleneck in existing post-hoc unlearning pipelines is constructing an effective "forget set" that approximates the target domain and induces the model to forget it. The paper presents a scalable, automated approach that uses the language model itself to generate high-quality forget sets: textbook-style data is synthesized through a structured prompting pipeline that requires only the domain name as input. Experiments on unlearning biosecurity, cybersecurity, and the Harry Potter novels show that the synthetic datasets consistently outperform existing synthetic baselines and perform on par with expert-curated datasets. Ablation studies further show that the multi-stage generation pipeline significantly increases data diversity, which in turn improves unlearning effectiveness. In conclusion, the study presents synthetic datasets as a promising route to practical, scalable unlearning across emerging domains without manual curation. The code and dataset are publicly available at https://github.com/xyzhu123/Synthetic_Textbook .
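The multi-stage pipeline described above can be sketched roughly as follows. This is a hypothetical illustration, not the authors' actual implementation: the `generate` callable stands in for any LLM completion API, and all prompt wording, function names, and the two-stage split (subtopic enumeration, then passage expansion) are assumptions based on the summary.

```python
# Hypothetical sketch of a multi-stage forget-set generation pipeline:
# domain name -> subtopic list -> textbook-style passages.
# `generate` is a placeholder for any LLM completion call.
from typing import Callable, List


def build_forget_set(domain: str, generate: Callable[[str], str],
                     n_topics: int = 3) -> List[str]:
    """Stage 1: enumerate subtopics of the domain.
    Stage 2: expand each subtopic into a textbook-style passage.
    Splitting generation into stages is what drives data diversity,
    compared with prompting for passages in a single step."""
    topic_prompt = (f"List {n_topics} key subtopics of '{domain}', "
                    "one per line.")
    topics = [t.strip() for t in generate(topic_prompt).splitlines()
              if t.strip()]
    passages = []
    for topic in topics[:n_topics]:
        passage_prompt = (f"Write a short textbook-style passage on "
                          f"'{topic}' within the domain of '{domain}'.")
        passages.append(generate(passage_prompt))
    return passages


# Usage with a stub LLM (a real deployment would call a model API):
def stub_llm(prompt: str) -> str:
    if prompt.startswith("List"):
        return "topic A\ntopic B\ntopic C"
    return f"[textbook passage for: {prompt}]"


forget_set = build_forget_set("biosecurity", stub_llm)
print(len(forget_set))  # one passage per generated subtopic
```

The resulting passages would then serve as the forget set for a standard post-hoc unlearning procedure.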

Takeaways, Limitations

Takeaways:
An automated method for generating forget sets with the language model itself improves the efficiency and scalability of post-hoc unlearning.
The synthetic datasets outperform existing synthetic baselines and achieve results comparable to expert-curated datasets.
The multi-stage generation pipeline increases data diversity, which improves unlearning effectiveness.
Demonstrates the feasibility of practical unlearning across a variety of emerging domains.
Limitations:
Further research is needed to determine the generalization performance of the proposed method. There is a possibility of overfitting to certain domains.
The quality of synthetic datasets may depend on the performance of prompt engineering and language models.
Performance may degrade due to differences between synthetic and real-world data.
Potential biases introduced during forget-set generation warrant consideration.