
Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

DeInfoReg: A Decoupled Learning Framework for Better Training Throughput

Created by
  • Haebom

Authors

Zih-Hao Huang, You-Teng Lin, Hung-Hsuan Chen

Outline

This paper proposes Decoupled Supervised Learning with Information Regularization (DeInfoReg), a novel approach that transforms a long gradient flow into multiple short ones to alleviate the vanishing gradient problem. By incorporating a pipelining strategy, DeInfoReg enables model parallelization across multiple GPUs, which significantly improves training throughput. The authors compare the proposed method with standard backpropagation (BP) and other gradient-flow decomposition techniques. Extensive experiments on various tasks and datasets show that DeInfoReg achieves better performance and stronger noise resistance than models trained with standard BP, while efficiently utilizing parallel computing resources. Code for reproducibility is available at https://github.com/ianzih/Decoupled-Supervised-Learning-for-Information-Regularization/ .
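The decoupling idea can be illustrated with a minimal sketch (a generic illustration of decoupled supervised learning, not the authors' code): the network is split into modules, each module receives a detached copy of the previous module's output, and each is trained against its own local loss, so no gradient stream spans more than one module. The per-module cross-entropy head below is a stand-in for the paper's information-regularization objective, and all names (`bodies`, `heads`, `train_step`) are illustrative.

```python
import torch
import torch.nn as nn

# Minimal sketch of decoupled supervised learning: each module gets a
# detached input and its own local loss, so every backward pass stays
# inside one module and every gradient stream is short. The local
# cross-entropy head is a stand-in for DeInfoReg's actual
# information-regularization objective.

dims, num_classes = [784, 512, 256, 128], 10
bodies = [nn.Sequential(nn.Linear(i, o), nn.ReLU()) for i, o in zip(dims, dims[1:])]
heads = [nn.Linear(o, num_classes) for o in dims[1:]]
opts = [torch.optim.Adam([*b.parameters(), *h.parameters()], lr=1e-3)
        for b, h in zip(bodies, heads)]
criterion = nn.CrossEntropyLoss()

def train_step(x, y):
    h = x
    for body, head, opt in zip(bodies, heads, opts):
        h = body(h.detach())          # detach: cut the gradient stream here
        loss = criterion(head(h), y)  # local loss for this module only
        opt.zero_grad()
        loss.backward()               # gradients never cross module boundaries
        opt.step()
    return h

# Example: one training step on a dummy batch.
train_step(torch.randn(32, 784), torch.randint(0, num_classes, (32,)))
```

Because each backward pass stops at the detach, a module's update never has to wait for gradients from deeper layers, which is what makes the pipelining across GPUs possible.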

Takeaways, Limitations

Takeaways:
Presents a novel method that effectively alleviates the vanishing gradient problem.
Improves training throughput via pipelined model parallelism across GPUs (see the sketch at the end of this section).
Shows superior performance and noise resistance compared with existing methods.
Code is publicly released for reproducibility.
Limitations:
Further studies are needed to establish the general applicability of the proposed method.
Further experiments on other architectures and datasets are needed.
Implementing the pipelining strategy adds engineering complexity.
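To make the throughput claim concrete, here is a rough sketch of how pipelining over decoupled modules could be structured; this is an assumption about the mechanism, not the authors' implementation. Each stage owns one module, one optimizer, and a stand-in local loss; it pulls detached activations from its input queue, updates immediately, and forwards activations downstream, so stage k can already process batch t+1 while stage k+1 is still on batch t. Python threads and CPU devices are used here purely for illustration; a real multi-GPU pipeline would pin each stage to its own GPU.

```python
import queue
import threading

import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()  # stand-in local loss per stage

def stage_worker(module, head, opt, in_q, out_q, device):
    # One pipeline stage: consume activations, update locally, pass on.
    module.to(device); head.to(device)
    while True:
        item = in_q.get()
        if item is None:                 # sentinel: propagate shutdown
            if out_q is not None:
                out_q.put(None)
            break
        x, y = item
        x = x.to(device).detach()        # detached input: no inter-stage grads
        h = module(x)
        loss = criterion(head(h), y.to(device))
        opt.zero_grad(); loss.backward(); opt.step()
        if out_q is not None:
            out_q.put((h.detach().cpu(), y))

# Wire up one thread per stage; queues carry (activation, label) pairs.
dims, num_classes = [784, 512, 256], 10
stages = [nn.Sequential(nn.Linear(i, o), nn.ReLU()) for i, o in zip(dims, dims[1:])]
heads = [nn.Linear(o, num_classes) for o in dims[1:]]
opts = [torch.optim.Adam([*s.parameters(), *h.parameters()], lr=1e-3)
        for s, h in zip(stages, heads)]
queues = [queue.Queue() for _ in range(len(stages))]
devices = ["cpu"] * len(stages)  # e.g. ["cuda:0", "cuda:1"] with two GPUs

threads = [
    threading.Thread(
        target=stage_worker,
        args=(s, h, o, queues[k],
              queues[k + 1] if k + 1 < len(stages) else None, devices[k]))
    for k, (s, h, o) in enumerate(zip(stages, heads, opts))
]
for t in threads:
    t.start()

for _ in range(4):  # feed a few dummy batches into the first stage
    queues[0].put((torch.randn(32, 784), torch.randint(0, num_classes, (32,))))
queues[0].put(None)
for t in threads:
    t.join()
```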