Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
The copyright of each paper belongs to its authors and their institutions; when sharing, please cite the source.

Revisiting Residual Connections: Orthogonal Updates for Stable and Efficient Deep Networks

Created by
  • Haebom

Author

Giyeong Oh, Woohyun Cho, Siyeol Kim, Suhwan Choi, Younjae Yu

Outline

This paper proposes the Orthogonal Residual Update (ORU) to overcome a limitation of conventional residual connections. Whereas a standard residual connection adds the module's full output directly to the input stream, largely reinforcing or rescaling the stream's existing direction, the proposed method adds only the component of the module output that is orthogonal to the input stream, encouraging the module to learn new representation directions. This enables richer feature learning and more efficient training. The authors experimentally demonstrate that the method improves generalization accuracy and training stability across architectures such as ResNetV2 and Vision Transformer, and across datasets such as the CIFAR datasets, TinyImageNet, and ImageNet-1k; for example, it improves the top-1 accuracy of ViT-B on ImageNet-1k by 4.3%.
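
The core mechanism can be illustrated concretely. The sketch below is a minimal PyTorch illustration (not the authors' released code): the module output f(x) is decomposed into a component parallel to the input stream x and a component orthogonal to it, and only the orthogonal part is added back to the stream. The function name, the choice of projecting along the last (feature) dimension, and the eps stabilizer are assumptions made here for illustration.

    import torch

    def orthogonal_residual_update(x: torch.Tensor, f_out: torch.Tensor,
                                   eps: float = 1e-6) -> torch.Tensor:
        """Residual update that keeps only the part of the module output
        orthogonal to the incoming stream x (computed along the last dim)."""
        # Projection coefficient <f(x), x> / <x, x>, per token/sample.
        coeff = (f_out * x).sum(dim=-1, keepdim=True) / (
            (x * x).sum(dim=-1, keepdim=True) + eps)
        parallel = coeff * x            # component of f(x) along x
        orthogonal = f_out - parallel   # component of f(x) orthogonal to x
        return x + orthogonal           # add only the new, orthogonal direction

    # Example: merging a sub-module's output into a Transformer-style stream
    # via the orthogonal update instead of the usual x + f(x).
    x = torch.randn(8, 16, 64)         # (batch, tokens, features)
    f_out = torch.randn(8, 16, 64)     # stand-in for an attention/MLP output
    y = orthogonal_residual_update(x, f_out)

In this sketch the parallel component is discarded rather than added, so the stream's existing direction is preserved and the module's capacity is spent on directions not already represented, which matches the intuition described above.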

Takeaways, Limitations

Takeaways:
  • Proposes a novel update strategy that overcomes limitations of existing residual connections.
  • Improves generalization performance and training stability across a variety of architectures and datasets.
  • Enables richer feature learning by steering modules toward new representation directions.
  • Reports substantial performance improvements, including a 4.3% increase in ViT-B's top-1 accuracy on ImageNet-1k.
Limitations:
  • No clear analysis of whether the proposed method increases computational cost.
  • Scalability to a wider range of architectures and datasets requires further study.
  • The orthogonal-component extraction step may need further optimization for efficiency.