Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

FedRecon: Missing Modality Reconstruction in Heterogeneous Distributed Environments

Created by
  • Haebom

Authors

Junming Liu, Yanting Gao, Yifei Sun, Yufei Jin, Yirong Chen, Ding Wang, Guosun Zeng

Outline

This paper proposes FedRecon, a federated learning (FL) method for multimodal data that is both incomplete and non-independent and identically distributed (non-IID), as is common in real-world scenarios. It is presented as the first method to target missing-modality reconstruction and non-IID adaptation simultaneously. It uses a lightweight multimodal variational autoencoder (MVAE) to reconstruct missing modalities while maintaining cross-modal consistency, and a novel distribution mapping mechanism to ensure data consistency and completeness. In addition, a global generator fixation strategy is introduced to prevent catastrophic forgetting and mitigate non-IID fluctuations. Extensive evaluation on multimodal datasets shows that FedRecon outperforms existing state-of-the-art methods under non-IID conditions.
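To make the idea of reconstructing a missing modality through a shared latent space more concrete, here is a minimal sketch of one common way multimodal VAEs combine whatever modalities are available: a product-of-experts fusion of per-modality Gaussian posteriors with a standard-normal prior. This is an assumption for illustration only; the summary does not say which fusion rule FedRecon's MVAE uses, and the function name `poe_fuse` and its arguments are hypothetical.

```python
import numpy as np


def poe_fuse(mus, logvars, present):
    """Fuse per-modality Gaussian posteriors with a N(0, I) prior expert.

    NOTE: illustrative assumption, not FedRecon's documented mechanism.

    mus, logvars: shape (n_modalities, latent_dim), encoder outputs per modality
    present:      shape (n_modalities,), False marks a missing modality
    Returns the mean and variance of the fused latent Gaussian.
    """
    mus = np.asarray(mus, dtype=float)
    logvars = np.asarray(logvars, dtype=float)
    present = np.asarray(present, dtype=bool)

    # Precision (inverse variance) of each available modality's posterior.
    precisions = np.exp(-logvars[present])

    # The prior N(0, I) contributes precision 1 and mean 0.
    total_precision = 1.0 + precisions.sum(axis=0)
    fused_mu = (precisions * mus[present]).sum(axis=0) / total_precision
    fused_var = 1.0 / total_precision
    return fused_mu, fused_var


# Two modalities with a 2-dimensional latent; the second modality is missing.
mus = [[2.0, 2.0], [0.0, 0.0]]
logvars = [[0.0, 0.0], [0.0, 0.0]]
mu, var = poe_fuse(mus, logvars, present=[True, False])
print(mu, var)
```

In a full model, the fused latent would be passed to the decoder of the missing modality to produce its reconstruction; missing modalities simply drop out of the product, which is why this style of fusion is often used for incomplete multimodal data.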

Takeaways, Limitations

Takeaways:
Presents a novel method that addresses missing-modality reconstruction and non-IID adaptation for multimodal data simultaneously.
Ensures data consistency and completeness through a lightweight MVAE and a novel distribution mapping mechanism.
Mitigates performance degradation caused by non-IID fluctuations through a global generator fixation strategy.
Demonstrates superior modality reconstruction performance, outperforming existing state-of-the-art methods under non-IID conditions.
Limitations:
The code is scheduled for release only after the paper is accepted, so reproducibility cannot currently be verified.
Additional performance analysis across different types of non-IID distributions is needed.
Only evaluation results for specific types of multimodal data are presented, requiring further validation of generalizability.
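
As an illustration of what analysis across different types of non-IID distributions could involve, the sketch below uses a Dirichlet label partition, a widely used way to simulate label skew across federated clients. It is an assumption for illustration: the summary does not state how the paper constructs its non-IID splits, and `dirichlet_partition` and its parameters are hypothetical names. Smaller `alpha` values produce more skewed client datasets.

```python
import numpy as np


def dirichlet_partition(labels, n_clients, alpha, seed=0):
    """Split sample indices across clients with Dirichlet-distributed label skew.

    NOTE: a generic simulation recipe, not the paper's documented protocol.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(n_clients)]

    for cls in np.unique(labels):
        # Indices of all samples of this class, in random order.
        idx = np.flatnonzero(labels == cls)
        rng.shuffle(idx)

        # Share of this class that each client receives.
        proportions = rng.dirichlet(alpha * np.ones(n_clients))

        # Convert shares into split points and hand out the chunks.
        cuts = (np.cumsum(proportions)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())

    return client_indices


# 60 samples, 3 classes, 4 clients; alpha=0.5 gives fairly strong label skew.
labels = np.repeat(np.arange(3), 20)
parts = dirichlet_partition(labels, n_clients=4, alpha=0.5)
print([len(p) for p in parts])
```

Sweeping `alpha` (for example from 0.1 to 10) and repeating the evaluation at each setting is one way to probe how a method behaves as the client distributions move from highly skewed toward nearly IID.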