Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content is summarized by Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Gradient-Based Model Fingerprinting for LLM Similarity Detection and Family Classification

Created by
  • Haebom

Author

Zehao Wu, Yanjie Zhao, Haoyu Wang

Outline

In this paper, we present TensorGuard, a gradient-based fingerprinting framework that addresses the problem of unauthorized derivative-model creation from large language models (LLMs). TensorGuard extracts model-specific behavioral signatures by analyzing the gradient responses of tensor layers to random input perturbations, without relying on training data, watermarks, or specific model formats. This enables similarity assessment between arbitrary models and systematic lineage classification of unknown models. The framework supports the safetensors format and constructs high-dimensional fingerprints through statistical analysis of gradient features. Experiments on 58 models (8 base models and 50 derivatives) show a family-classification accuracy of 94%.
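
The core mechanism can be illustrated with a short sketch. The code below is a minimal illustration, not the authors' implementation: it assumes a PyTorch model that accepts continuous inputs (for a real LLM, one would probe at the embedding layer rather than discrete token ids), and the function names (extract_fingerprint, similarity, classify_family), the probe count, and the choice of gradient statistics are all hypothetical.

```python
# A minimal sketch of gradient-based fingerprinting, assuming a PyTorch model.
# All names and statistics here are illustrative assumptions, not TensorGuard's API.
import torch
import torch.nn.functional as F

def extract_fingerprint(model, input_shape, n_probes=32, seed=0):
    """Fingerprint = statistics of gradient responses to random input probes."""
    torch.manual_seed(seed)                       # fixed seed: identical probes for every model
    model.eval()
    feats = []
    for _ in range(n_probes):
        x = torch.randn(*input_shape)             # random input perturbation
        model.zero_grad(set_to_none=True)         # clear gradients between probes
        model(x).sum().backward()                 # scalarize the output, then backprop
        for p in model.parameters():              # per-tensor-layer gradient statistics
            if p.grad is not None:
                g = p.grad.flatten()
                feats += [g.mean(), g.std(), g.abs().max(), g.norm()]
    return torch.stack(feats)                     # high-dimensional fingerprint vector

def similarity(fp_a, fp_b):
    """Similarity score between two models' fingerprints."""
    return F.cosine_similarity(fp_a, fp_b, dim=0).item()

def classify_family(unknown_fp, base_fps):
    """Assign an unknown model to the base family with the closest fingerprint."""
    return max(base_fps, key=lambda name: similarity(unknown_fp, base_fps[name]))
```

Comparing fingerprints this way assumes matching fingerprint dimensions (i.e., comparable architectures); classifying an unknown derivative then reduces to finding the base-model fingerprint with the highest similarity.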

Takeaways, Limitations

Takeaways:
Presents a new technique that helps address the unauthorized derivation and redistribution of LLMs.
Provides an effective mechanism for tracking model lineage and verifying license compliance.
Enables model similarity detection and lineage classification independent of training data, watermarks, and model formats.
Achieves high classification accuracy across diverse LLM families.
Limitations:
The types and number of models used in the experiments may be limited; testing on a broader, more diverse set of models is needed.
Further research is needed on how well TensorGuard's performance generalizes to new LLM architectures or derivatives.
Further evaluation of applicability and effectiveness in real-world LLM deployment environments is needed.
Further research is needed to determine whether the method is vulnerable to highly sophisticated modifications or adaptive attacks.