Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

NeMo: A Neuron-Level Modularizing-While-Training Approach for Decomposing DNN Models

Created by
  • Haebom

Authors

Xiaohan Bi, Binhang Qi, Hailong Sun, Xiang Gao, Yue Yu, Xiaojun Liang

Outline

This paper addresses the inference overhead incurred when an entire DNN must be reused, by modularizing deep neural networks (DNNs) so that only the needed module is deployed, thereby reducing reuse costs. It focuses on modularizing-while-training (MwT) methods, which outperform modularizing-after-training approaches, and proposes NeMo, a scalable MwT approach applicable to large-scale models and diverse DNN architectures, particularly Transformer-based models. NeMo performs modularization at the neuron level and trains modules with a contrastive-learning-based method and a composite loss function designed for large-scale models. Experiments on two Transformer-based models and four CNN models show that NeMo improves module classification accuracy by 1.72% on average and reduces module size by 58.10% compared with state-of-the-art MwT methods. Case studies on open-source projects demonstrate NeMo's practical benefits.
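
To make the idea concrete, below is a minimal, hypothetical PyTorch sketch of what neuron-level modularizing-while-training with a contrastive, composite loss can look like. It is not the authors' implementation; all names (NeuronMaskedMLP, composite_loss, contrastive_weight, sparsity_weight, etc.) are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuronMaskedMLP(nn.Module):
    """Toy classifier whose hidden neurons are gated by per-class relevance masks."""
    def __init__(self, in_dim=784, hidden_dim=256, num_classes=10):
        super().__init__()
        self.fc1 = nn.Linear(in_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, num_classes)
        # One learnable relevance logit per (class, hidden neuron):
        # sigmoid(mask_logits[c]) approximates which neurons class c's module keeps.
        self.mask_logits = nn.Parameter(torch.zeros(num_classes, hidden_dim))

    def forward(self, x, y):
        h = F.relu(self.fc1(x))
        mask = torch.sigmoid(self.mask_logits[y])   # (batch, hidden_dim)
        return self.fc2(h * mask), mask

def composite_loss(logits, y, mask, temperature=0.5,
                   contrastive_weight=0.1, sparsity_weight=0.01):
    """Task loss + contrastive term on neuron masks + sparsity term (smaller modules)."""
    task = F.cross_entropy(logits, y)
    # Supervised-contrastive-style term: masks of same-class samples should be
    # similar, masks of different-class samples dissimilar.
    sim = F.cosine_similarity(mask.unsqueeze(1), mask.unsqueeze(0), dim=-1) / temperature
    same = (y.unsqueeze(1) == y.unsqueeze(0)).float()
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    contrastive = -(same * log_prob).sum(1) / same.sum(1)
    sparsity = mask.mean()                           # encourages small modules
    return task + contrastive_weight * contrastive.mean() + sparsity_weight * sparsity

# One modularizing-while-training step on random data, just to show the flow.
model = NeuronMaskedMLP()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))
logits, mask = model(x, y)
loss = composite_loss(logits, y, mask)
loss.backward()
opt.step()
```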

Takeaways, Limitations

Takeaways:
Presents a general MwT approach that, through neuron-level modularization, applies to diverse DNN architectures (CNNs, Transformers, etc.) and to large-scale models.
Achieves scalable performance even on large-scale models via an effective modular training method based on contrastive learning and a composite loss function.
Suggests that improved module classification accuracy and reduced module size enable efficient reuse of DNN models at lower cost (a module-extraction sketch follows the Limitations list below).
Demonstrates applicability to real open-source projects.
Limitations:
The experiments are limited to specific datasets and models; generalization to other datasets and architectures still needs to be verified.
The additional computational and memory overhead of implementing and applying NeMo in practice is not analyzed.
Neuron-level modularization does not always guarantee optimal performance, so further research on optimizing modularization strategies is needed.
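
As a complement to the training sketch above, here is a self-contained, hypothetical example of how a neuron-level module might be extracted and reused once per-class relevance scores have been learned: neurons whose relevance exceeds a threshold are kept, and the corresponding weights are sliced into a smaller sub-network, which is what reduces module size and inference cost. The function and parameter names (extract_module, mask_logits, threshold) are assumptions, not NeMo's actual API.

```python
import torch
import torch.nn as nn

def extract_module(fc1: nn.Linear, fc2: nn.Linear,
                   mask_logits: torch.Tensor, target_class: int,
                   threshold: float = 0.5) -> nn.Sequential:
    """Keep only the hidden neurons whose learned relevance exceeds `threshold`."""
    with torch.no_grad():
        keep = torch.sigmoid(mask_logits[target_class]) > threshold
        idx = keep.nonzero(as_tuple=True)[0]
        # Slice the trained weights so the module contains only the kept neurons.
        small_fc1 = nn.Linear(fc1.in_features, idx.numel())
        small_fc1.weight.copy_(fc1.weight[idx])
        small_fc1.bias.copy_(fc1.bias[idx])
        small_fc2 = nn.Linear(idx.numel(), fc2.out_features)
        small_fc2.weight.copy_(fc2.weight[:, idx])
        small_fc2.bias.copy_(fc2.bias)
    return nn.Sequential(small_fc1, nn.ReLU(), small_fc2)

# Usage with freshly initialized layers, just to show the shapes involved:
fc1, fc2 = nn.Linear(784, 256), nn.Linear(256, 10)
mask_logits = torch.randn(10, 256)   # stand-in for learned relevance logits
module = extract_module(fc1, fc2, mask_logits, target_class=3)
print(module)                        # a smaller sub-network that can be reused on its own
```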