Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

DATABench: Evaluating Dataset Auditing in Deep Learning from an Adversarial Perspective

Created by
  • Haebom

Authors

Shuo Shao, Yiming Li, Mengren Zheng, Zhiyang Hu, Yukun Chen, Boheng Li, Yu He, Junfeng Guo, Dacheng Tao, Zhan Qin

Outline

This paper studies dataset auditing techniques, which address privacy and copyright issues stemming from the lack of transparency about the datasets used to train deep learning models. We analyze the vulnerabilities of existing dataset auditing techniques to adversarial attacks and propose a new taxonomy that divides them into internal feature (IF)-based and external feature (EF)-based methods. Furthermore, we define two major attack types: evasion attacks, which conceal that a dataset was used, and forgery attacks, which falsely claim that an unused dataset was used. We propose systematic attack strategies for each type: separation, removal, and detection for evasion attacks, and adversarial example-based methods for forgery attacks. Finally, we present a new benchmark, DATABench, comprising 17 evasion attacks, 5 forgery attacks, and 9 representative auditing techniques. Our evaluation results demonstrate that existing auditing techniques are not sufficiently robust or discriminative in adversarial settings.
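The two attack types can be read as the two ways an audit verdict can be pushed into error. The following is a minimal sketch of that reading, not code from the paper or from DATABench: the names `FeatureType`, `AttackType`, and `attack_succeeded` are hypothetical, and the audit is reduced to a single boolean verdict for illustration.

```python
from enum import Enum


class FeatureType(Enum):
    # The two auditing families named in the summary.
    INTERNAL = "IF"  # internal feature-based auditing
    EXTERNAL = "EF"  # external feature-based auditing


class AttackType(Enum):
    EVASION = "evasion"  # conceal that a dataset was used
    FORGERY = "forgery"  # falsely claim that an unused dataset was used


def attack_succeeded(attack: AttackType,
                     dataset_was_used: bool,
                     audit_says_used: bool) -> bool:
    """Hypothetical success criterion, assuming the audit returns one boolean verdict."""
    if attack is AttackType.EVASION:
        # Evasion applies when the dataset really was used;
        # it succeeds if the audit nevertheless reports "not used".
        return dataset_was_used and not audit_says_used
    # Forgery applies when the dataset was not used;
    # it succeeds if the audit nevertheless reports "used".
    return (not dataset_was_used) and audit_says_used
```

Under this simplification, a robust auditing technique is one for which `attack_succeeded` stays false for evasion attempts, and a discriminative one is one for which it stays false for forgery attempts.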

Takeaways, Limitations

Takeaways:
We systematically analyze the vulnerability of dataset auditing techniques to adversarial attacks and provide a new benchmark, DATABench, that points to directions for future research.
We propose a new taxonomy that groups existing auditing techniques into internal feature (IF)-based and external feature (EF)-based methods.
We present systematic attack strategies for both evasion and forgery attacks.
Limitations:
Although the paper demonstrates that current dataset auditing techniques are vulnerable to adversarial attacks, it does not offer specific solutions for developing more robust and reliable auditing techniques.
The types of attacks and auditing techniques included in DATABench may be limited. Future benchmarks should be expanded to include a wider range of attacks and auditing techniques.