Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
The copyright of the paper belongs to the author and the relevant institution. When sharing, simply cite the source.

FediLoRA: Heterogeneous LoRA for Federated Multimodal Fine-tuning under Missing Modalities

Created by
  • Haebom

Authors

Lishan Yang, Wei Emma Zhang, Nam Kha Nguygen, Po Hu, Yanjun Shu, Weitong Chen, Mong Yuan Sim

Outline

This paper presents FediLoRA, a framework for federated multimodal fine-tuning that addresses the limitations of existing federated low-rank adaptation (LoRA) methods, LoRA being a parameter-efficient fine-tuning (PEFT) technique. Existing federated LoRA methods assume that every client uses the same rank and receives single-modality input. FediLoRA drops both assumptions and targets two realistic conditions: heterogeneous client resources, which lead to different LoRA ranks per client, and missing modalities. A dimension-wise aggregation strategy reweights LoRA updates so that information is not diluted during aggregation, and a lightweight layer-wise model editing method refines local components, improving both client and global model performance. Experiments on multimodal benchmark datasets show that FediLoRA outperforms competing methods, especially when modalities are incomplete.
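To make the "information dilution" point concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the paper's exact formula: it assumes that "dimension-wise" means renormalizing the aggregation weights, for each rank dimension, over only the clients whose LoRA rank covers that dimension. The function names, the rank-major list layout, and the zero-padding baseline used for contrast are all hypothetical choices made for this example.

```python
from typing import List, Sequence

Vector = List[float]
RankMajor = List[Vector]  # one vector per LoRA rank dimension (hypothetical layout)


def aggregate_zero_padded(clients: Sequence[RankMajor],
                          weights: Sequence[float]) -> RankMajor:
    """Baseline for contrast: treat missing rank dimensions as zeros,
    then take a weighted mean over ALL clients."""
    max_rank = max(len(c) for c in clients)
    width = len(clients[0][0])
    total = sum(weights)
    out = []
    for i in range(max_rank):
        row = [0.0] * width
        for c, w in zip(clients, weights):
            if i < len(c):  # clients lacking dimension i add nothing (zeros)
                for j in range(width):
                    row[j] += (w / total) * c[i][j]
        out.append(row)
    return out


def aggregate_dimension_wise(clients: Sequence[RankMajor],
                             weights: Sequence[float]) -> RankMajor:
    """Assumed reading of dimension-wise aggregation: for each rank
    dimension, renormalize weights over the clients that hold it."""
    max_rank = max(len(c) for c in clients)
    width = len(clients[0][0])
    out = []
    for i in range(max_rank):
        holders = [(c, w) for c, w in zip(clients, weights) if i < len(c)]
        total = sum(w for _, w in holders)  # positive weights assumed
        row = [0.0] * width
        for c, w in holders:
            for j in range(width):
                row[j] += (w / total) * c[i][j]
        out.append(row)
    return out


# Toy example: a rank-1 client and a rank-2 client with equal weights.
low_rank_client = [[2.0, 2.0]]
high_rank_client = [[4.0, 4.0], [6.0, 6.0]]
padded = aggregate_zero_padded([low_rank_client, high_rank_client], [1.0, 1.0])
dimwise = aggregate_dimension_wise([low_rank_client, high_rank_client], [1.0, 1.0])
```

In this toy case both schemes agree on the first rank dimension, which both clients hold. On the second dimension, held only by the higher-rank client, the zero-padded mean halves the value (6.0 becomes 3.0), while the per-dimension renormalization keeps it at 6.0. That halving is one plausible reading of the dilution the summary refers to.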

Takeaways, Limitations

Takeaways:
The paper presents an efficient federated multimodal fine-tuning framework that accounts for heterogeneous client resources and missing modalities.
It shows that combining a dimension-wise aggregation strategy with a layer-wise model editing method improves performance without information dilution.
FediLoRA outperforms existing methods on multimodal benchmark datasets, particularly when modalities are incomplete.
Limitations:
The method's effectiveness may vary across experimental settings, so broader experimental validation is needed.
Further research is needed to determine whether the method applies to all types of multimodal data.
The complexity and computational cost of the layer-wise model editing method are not analyzed.