Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

FedP²EFT: Federated Learning to Personalize PEFT for Multilingual LLMs

Created by
  • Haebom

Authors

Royson Lee, Minyoung Kim, Fady Rezk, Rui Li, Stylianos I. Venieris, Timothy Hospedales

Outline

This paper focuses on federated learning (FL), which enables training multilingual large language models (LLMs) on diverse, distributed multilingual data, which is especially valuable for low-resource languages. Personalization with parameter-efficient fine-tuning (PEFT) modules such as LoRA is commonly used to improve client-specific performance. This requires choosing personalization strategies (PSs), such as the design of the PEFT adapter structure (e.g., which layers receive LoRA and at what ranks) and the fine-tuning hyperparameters (e.g., learning rates). Instead of configuring PSs manually, this paper proposes FedP²EFT, a federated learning-to-personalize method for multilingual LLMs in the cross-device FL setting. FedP²EFT collaboratively learns an optimal personalized PEFT structure for each client via Bayesian sparse rank selection. Evaluations on both simulated and real-world multilingual FL benchmarks show that FedP²EFT significantly outperforms existing personalized fine-tuning methods and complements other existing FL methods.
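To make the core idea concrete, here is a minimal sketch (not the authors' code) of what per-client rank selection for a LoRA adapter could look like: the adapter is given a maximum rank, and a sparse gate over the rank components is learned from the client's data, so the effective per-layer rank is selected rather than fixed by hand. The relaxed-Bernoulli gates, the sparsity penalty weight, and the names `GatedLoRALinear` and `rank_sparsity_loss` are illustrative assumptions, not the paper's exact Bayesian formulation.

```python
import torch
import torch.nn as nn


class GatedLoRALinear(nn.Module):
    """Frozen linear layer + LoRA adapter whose rank components are gated."""

    def __init__(self, base: nn.Linear, max_rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # only the adapter and gates are trained
        in_f, out_f = base.in_features, base.out_features
        self.A = nn.Parameter(torch.randn(max_rank, in_f) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_f, max_rank))
        # One gate logit per rank component; sigmoid(logit) ~ keep probability.
        self.gate_logits = nn.Parameter(torch.zeros(max_rank))
        self.scaling = alpha / max_rank

    def gates(self, hard: bool = False) -> torch.Tensor:
        probs = torch.sigmoid(self.gate_logits)
        if hard:
            return (probs > 0.5).float()  # pruned adapter at inference time
        # Relaxed Bernoulli (Gumbel-sigmoid) sample keeps training differentiable.
        u = torch.rand_like(probs).clamp(1e-6, 1 - 1e-6)
        noise = torch.log(u) - torch.log(1 - u)
        return torch.sigmoid((self.gate_logits + noise) / 0.5)

    def forward(self, x: torch.Tensor, hard: bool = False) -> torch.Tensor:
        g = self.gates(hard)                              # (max_rank,)
        lora = ((x @ self.A.t()) * g) @ self.B.t()        # gated low-rank update
        return self.base(x) + self.scaling * lora


def rank_sparsity_loss(model: nn.Module) -> torch.Tensor:
    """Penalty on the expected number of active ranks, encouraging small ranks."""
    total = torch.zeros(())
    for m in model.modules():
        if isinstance(m, GatedLoRALinear):
            total = total + torch.sigmoid(m.gate_logits).sum()
    return total


# Toy usage: one client fine-tunes its adapter and rank gates on local data.
layer = GatedLoRALinear(nn.Linear(32, 32), max_rank=8)
opt = torch.optim.Adam([p for p in layer.parameters() if p.requires_grad], lr=1e-3)
x, y = torch.randn(64, 32), torch.randn(64, 32)
for _ in range(100):
    loss = nn.functional.mse_loss(layer(x), y) + 1e-3 * rank_sparsity_loss(layer)
    opt.zero_grad()
    loss.backward()
    opt.step()
print("kept ranks:", int(layer.gates(hard=True).sum().item()), "of 8")
```

In a federated setting, such per-layer gates would be learned per client, so each client ends up with its own adapter ranks; how the rank-selection knowledge is shared across clients follows the paper's FL procedure, which this sketch does not reproduce.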

Takeaways, Limitations

Takeaways:
  • Presents a novel federated learning-based personalization method (FedP²EFT) that improves the client-specific performance of multilingual LLMs.
  • Bayesian sparse rank selection efficiently learns the optimal PEFT structure and mitigates overfitting in low-data settings.
  • Superior performance over existing methods is verified on both simulated and real-world datasets.
  • The method can complement a variety of existing FL methods.
  • The open-source code release supports reproducibility and scalability.
Limitations:
  • The performance of the proposed method may depend on the specific dataset and LLM architecture.
  • Further research is needed on generalization to real-world multilingual environments.
  • Bayesian sparse rank selection can be computationally expensive.
  • More comprehensive experiments are needed on multilingual datasets of varying sizes and characteristics.