Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

FedSEA-LLaMA: A Secure, Efficient and Adaptive Federated Splitting Framework for Large Language Models

Created by
  • Haebom

Authors

Zishuai Zhang, Hainan Zhang, Weihua Li, Qinnan Zhang, Jin Dong, Yongxin Tong, Zhiming Zheng

Outline

This paper proposes FedSEA-LLaMA, a federated splitting framework for transformer-based large language models (LLMs), which leverages private data to improve LLM performance while addressing data silos and high computational demands. FedSEA-LLaMA preserves data privacy by placing most model parameters on the server (or across distributed clients), with only a small portion kept on each client. To overcome the limitations of existing federated splitting frameworks, namely the vulnerability of P2P encryption, the high communication overhead of sequential training and inference, and fixed split points, the authors propose secure vector transmission via Gaussian noise injection, reduced communication costs through attention mask compression and KV cache collaboration, and user-controlled dynamic adjustment of the split point. Experiments on natural language understanding, summarization, and conversational question answering show that FedSEA-LLaMA achieves up to an 8x speedup in training and inference over centralized LLaMA2 with no performance degradation. Its security and adaptability are further demonstrated through privacy attack analysis and an analysis of various split points.
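To make the split setup concrete, here is a minimal sketch (not the authors' code) of a split forward pass in which the client injects Gaussian noise into the hidden states it transmits, in the spirit of the secure vector transmission described above. All module names, layer counts, and the noise scale are illustrative assumptions.

```python
import torch
import torch.nn as nn

class ClientSide(nn.Module):
    """Client keeps the embedding and the first `split_point` blocks."""
    def __init__(self, vocab_size=32000, d_model=512, split_point=2, sigma=0.1):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_model)
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
            for _ in range(split_point)
        )
        self.sigma = sigma  # noise scale: trades privacy against utility

    def forward(self, input_ids):
        h = self.embed(input_ids)
        for blk in self.blocks:
            h = blk(h)
        # Inject Gaussian noise before transmission so raw activations
        # never leave the client (the "secure vector transmission" idea).
        return h + self.sigma * torch.randn_like(h)

class ServerSide(nn.Module):
    """Server hosts the bulk of the transformer blocks."""
    def __init__(self, d_model=512, n_blocks=10):
        super().__init__()
        self.blocks = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
            for _ in range(n_blocks)
        )

    def forward(self, h):
        for blk in self.blocks:
            h = blk(h)
        return h

# Usage: the client computes noisy activations; the server continues the pass.
client, server = ClientSide(), ServerSide()
ids = torch.randint(0, 32000, (1, 16))  # dummy token ids
out = server(client(ids))               # only noisy vectors cross the wire
print(out.shape)                        # torch.Size([1, 16, 512])
```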

Takeaways, Limitations

Takeaways:
Presents a novel federated learning framework that leverages private data to improve LLM performance while preserving data privacy.
Effectively addresses three major limitations of existing federated splitting frameworks: vulnerable P2P encryption, sequential processing, and fixed split points.
Achieves performance comparable to centralized LLaMA2 with up to an 8x speedup across a variety of tasks, including natural language understanding, summarization, and conversational question answering.
Adapts to task-specific requirements through user-controlled dynamic split-point adjustment (see the sketch after this section).
Limitations:
Further research is needed to determine the applicability and scalability of the proposed method to real-world environments.
Robustness evaluation is required for various data distributions and network environments.
Further research is needed on performance degradation due to Gaussian noise injection and optimal noise level settings.
Optimization for specific hardware environments requires further research.
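
On the dynamic split point mentioned in the takeaways, the following is a hypothetical illustration of how one stack of transformer blocks could be repartitioned between client and server at a user-chosen index. The `split_model` helper and its interface are assumptions for illustration, not the paper's API.

```python
import torch.nn as nn

def split_model(blocks: nn.ModuleList, split_point: int):
    """Partition a block stack at `split_point`; the client keeps the shallow layers."""
    client = nn.ModuleList(blocks[:split_point])   # stays on the client
    server = nn.ModuleList(blocks[split_point:])   # offloaded to the server
    return client, server

blocks = nn.ModuleList(
    nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
    for _ in range(12)
)
# A privacy-sensitive client might keep more layers (larger split point),
# while a resource-constrained one keeps fewer.
client_blocks, server_blocks = split_model(blocks, split_point=4)
print(len(client_blocks), len(server_blocks))  # 4 8
```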