Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

MobiEdit: Resource-efficient Knowledge Editing for Personalized On-device LLMs

Created by
  • Haebom

Authors

Zhenyan Lu, Daliang Xu, Dongqi Cai, Zexi Li, Wei Liu, Fangming Liu, Shangguang Wang, Mengwei Xu

Outline

In this paper, we present MobiEdit, a novel framework for knowledge editing of large language models (LLMs) on mobile devices. Existing knowledge editing methods rely on resource-intensive backpropagation (BP), which makes them difficult to run on mobile hardware. MobiEdit replaces backpropagation with forward-only quantized gradient estimation, ensuring compatibility with energy-efficient mobile NPUs. It further improves the efficiency of gradient estimation through an early stopping mechanism and a prefix cache. Experimental results show that MobiEdit enables real-time editing of a 3-billion-parameter model (Qwen2.5-3B-Instruct) with 7.6x less memory, 14.7x less energy, and 3.6x lower latency than existing methods.
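The core idea of forward-only gradient estimation can be illustrated with a minimal SPSA-style zeroth-order sketch: the gradient is approximated from two forward passes with a random sign perturbation, so no backpropagation graph is needed. Note that this is a generic illustration of the technique, not MobiEdit's actual implementation, which additionally quantizes the estimator for NPU execution; `loss_fn`, `spsa_grad_estimate`, and the toy quadratic loss below are all hypothetical names for this sketch.

```python
import numpy as np

def spsa_grad_estimate(loss_fn, params, eps=1e-3, seed=0):
    """Forward-only (zeroth-order) gradient estimate via SPSA.

    Two forward evaluations of loss_fn replace a backward pass:
    the finite difference along a random sign vector z gives an
    unbiased estimate of the gradient. Hypothetical sketch only.
    """
    rng = np.random.default_rng(seed)
    z = rng.choice([-1.0, 1.0], size=params.shape)  # random sign perturbation
    loss_plus = loss_fn(params + eps * z)            # forward pass 1
    loss_minus = loss_fn(params - eps * z)           # forward pass 2
    return (loss_plus - loss_minus) / (2 * eps) * z

# Toy quadratic loss with minimum at w = [1, 2], standing in for an
# editing objective; a fresh perturbation is drawn each step.
target = np.array([1.0, 2.0])
loss = lambda w: float(np.sum((w - target) ** 2))

w = np.zeros(2)
for step in range(200):
    w -= 0.1 * spsa_grad_estimate(loss, w, seed=step)
```

Because each estimate needs only forward evaluations, the whole loop can run on inference-only accelerators such as mobile NPUs, at the cost of noisier gradients than exact backpropagation.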

Takeaways, Limitations

Takeaways:
  • Provides a practical framework for LLM personalization on mobile devices
  • Enables efficient knowledge editing on energy-efficient mobile NPUs
  • Achieves real-time knowledge editing with significantly less resource consumption than existing methods
  • Improves performance through early stopping and prefix cache optimizations
Limitations:
  • The reported performance may depend on the specific model (Qwen2.5-3B-Instruct) and mobile device environment; generalization to other models or devices requires further study.
  • Quantized gradient estimation is less precise than exact backpropagation, which may reduce the accuracy of knowledge edits.
  • Experimental results are presented only for a 3-billion-parameter model; applicability to larger models has not been verified.