Daily Arxiv

This page collects papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
The copyright of each paper belongs to its authors and their institutions; when sharing, please cite the source.

Feature Distillation is the Better Choice for Model-Heterogeneous Federated Learning

Created by
  • Haebom

Author

Yichen Li, Xiuying Wang, Wenchao Xu, Haozhao Wang, Yining Qi, Jiahua Dong, Ruixuan Li

Outline

Model-heterogeneous federated learning (Hetero-FL) has attracted attention for its ability to aggregate knowledge from heterogeneous models while keeping each client's data local. To better integrate client knowledge, ensemble distillation is a widely used and effective technique for improving the global model after global aggregation. However, naively combining Hetero-FL with ensemble distillation does not always yield promising results and can destabilize training, because existing methods rely mainly on logit distillation over model-agnostic softmax predictions and fail to compensate for the knowledge bias introduced by heterogeneous models. To address these issues, this paper proposes FedFD, a stable and efficient feature distillation method for model-heterogeneous federated learning that integrates aligned feature information via orthogonal projection to better fuse knowledge from heterogeneous models. Specifically, the authors propose a feature-based ensemble federated knowledge distillation paradigm: the server maintains a projection layer for each client-side model architecture to align features individually, and an orthogonality technique reparameterizes these projection layers to mitigate the knowledge bias of heterogeneous models and maximize the distilled knowledge. Extensive experiments show that FedFD outperforms state-of-the-art methods.
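
To make the server-side procedure concrete, below is a minimal sketch, assuming a PyTorch implementation, of feature-based ensemble distillation with one orthogonally parameterized projection layer per client architecture. The class name ServerDistiller, the mean-ensemble target, and the MSE distillation loss are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch (not the authors' released code), assuming a PyTorch setting:
# server-side feature distillation with one orthogonally parameterized projection
# layer per client architecture.
from typing import Dict

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.nn.utils.parametrizations import orthogonal


class ServerDistiller(nn.Module):
    def __init__(self, global_feat_dim: int, client_feat_dims: Dict[str, int]):
        super().__init__()
        # One projection per client architecture; the orthogonal parametrization
        # keeps each projection (semi-)orthogonal, which is intended to reduce
        # the knowledge bias introduced by heterogeneous feature spaces.
        self.projections = nn.ModuleDict({
            arch: orthogonal(nn.Linear(dim, global_feat_dim, bias=False))
            for arch, dim in client_feat_dims.items()
        })

    def distill_loss(self, global_feat: torch.Tensor,
                     client_feats: Dict[str, torch.Tensor]) -> torch.Tensor:
        # Align each client's features into the global feature space, then match
        # the global model's features to the ensemble (here, a simple mean) of
        # the aligned client features on a shared distillation batch.
        aligned = [self.projections[arch](f) for arch, f in client_feats.items()]
        target = torch.stack(aligned, dim=0).mean(dim=0).detach()
        return F.mse_loss(global_feat, target)
```

The simple mean over aligned features and the MSE loss are placeholders; the paper's exact ensemble weighting and projection-layer reparameterization may differ.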

Takeaways, Limitations

Takeaways:
  • Presents FedFD, a novel feature distillation method for stable and efficient knowledge integration in model-heterogeneous federated learning.
  • Leverages orthogonal projections to mitigate knowledge bias between heterogeneous models and improve performance.
  • Proposes a novel feature-based ensemble federated knowledge distillation paradigm.
  • Demonstrates superior performance compared to existing SOTA methods.
Limitations:
  • The server must maintain a projection layer for each client model architecture, which may increase computational and memory overhead.
  • Further study is needed on the magnitude of the performance gains and on generalization across diverse environments.
  • Comparative analysis with other knowledge distillation methods not covered in this paper is needed.