Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Reconstruction of Differentially Private Text Sanitization via Large Language Models

Created by
  • Haebom

Author

Shuchao Pang, Zhigang Lu, Haichen Wang, Peng Fu, Yongbin Zhou, Minhui Xue

Outline

This paper demonstrates that large language models (LLMs) can reconstruct personal information even from text sanitized with differential privacy (DP) techniques. The researchers propose two attacks, black-box and white-box, depending on the level of access to the LLM, and experimentally demonstrate the connection between DP-sanitized text and the training data of privacy-preserving LLMs. Experiments on word- and sentence-level DP were conducted with various LLMs, including LLaMA-2, LLaMA-3, and ChatGPT, on datasets such as WikiMIA and Pile-CC, and the results confirm high reconstruction success rates. For example, black-box attacks against word-level DP on the WikiMIA dataset achieved success rates of 72.18% with LLaMA-2 (70B), 82.39% with LLaMA-3 (70B), 91.2% with ChatGPT-4o, and 94.01% with Claude-3.5. This reveals security vulnerabilities in existing DP techniques and suggests that LLMs themselves pose a new security threat.
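To make the attack setting concrete, below is a minimal sketch (not the paper's actual method) of the black-box scenario: a toy word-level DP sanitizer based on randomized response over a small vocabulary, followed by construction of a reconstruction prompt that an adversary with only query access could send to an LLM. The vocabulary, the epsilon value, and the `some_llm_client.complete` call are all hypothetical illustrations.

```python
import math
import random

def sanitize(words, vocab, eps):
    """Toy word-level DP via randomized response: keep each word with
    probability e^eps / (e^eps + |V| - 1), otherwise replace it with a
    uniformly random vocabulary word. A simplified stand-in for the
    word-level DP mechanisms evaluated in the paper."""
    p_keep = math.exp(eps) / (math.exp(eps) + len(vocab) - 1)
    return [w if random.random() < p_keep else random.choice(vocab) for w in words]

def build_reconstruction_prompt(sanitized_words):
    """Black-box attack sketch: the adversary only needs query access to an
    LLM and asks it to 'denoise' the sanitized text, relying on the model's
    language prior (and any memorized training data)."""
    return (
        "The following sentence was perturbed by replacing some words with "
        "random ones. Reconstruct the most likely original sentence:\n\n"
        + " ".join(sanitized_words)
    )

if __name__ == "__main__":
    vocab = "the patient was diagnosed with diabetes at clinic yesterday".split()
    original = "the patient was diagnosed with diabetes yesterday".split()
    noisy = sanitize(original, vocab, eps=2.0)
    prompt = build_reconstruction_prompt(noisy)
    print(prompt)
    # response = some_llm_client.complete(prompt)  # hypothetical LLM call
```

In the white-box variant described in the paper, the adversary additionally exploits access to the model itself (e.g., its parameters or training pipeline) rather than relying on prompting alone.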

Takeaways, Limitations

Takeaways:
Reveals the limitations of existing differential privacy (DP) text sanitization techniques.
Shows that large language models (LLMs) can become a new avenue for personal information leakage.
Raises the need for improved DP techniques and new defense strategies against LLM-based attacks.
Supports the generality of the findings with experimental results across multiple LLMs and datasets.
Limitations:
Further research is needed to assess the effectiveness and applicability of the proposed attacks in real-world environments.
Stronger DP mechanisms or defenses against LLM-based reconstruction remain to be developed.
The results cover specific LLMs and datasets, which limits generalization to other models and data.
The work lacks analysis and verification of real-world cases of personal information leakage.