Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

EcoTransformer: Attention without Multiplication

Created by
  • Haebom

Author

Xin Gao, Xingming Xu, Shirin Amiraslani, Hong Xu

Outline

This paper proposes EcoTransformer, a novel Transformer architecture that addresses the high computational cost and energy consumption of the scaled dot-product attention mechanism used in standard Transformers. EcoTransformer produces the output context vector by convolving the values with a Laplacian kernel, where the distances between queries and keys are measured with the L1 metric. Unlike dot-product-based attention, this formulation avoids matrix multiplication when computing attention scores, substantially reducing computational cost. It performs on par with or better than scaled dot-product attention on NLP, bioinformatics, and vision tasks, while consuming significantly less energy.
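To make the idea concrete, below is a minimal PyTorch sketch of attention scored by a Laplacian kernel over L1 distances rather than scaled dot products. This is not the authors' implementation: the function name `laplacian_l1_attention`, the `scale` parameter, and the normalization step are illustrative assumptions, and the paper's exact kernel bandwidth and normalization may differ.

```python
import torch

def laplacian_l1_attention(q, k, v, scale=1.0):
    """Attention where scores come from a Laplacian kernel over L1
    (Manhattan) distances between queries and keys, instead of scaled
    dot products. Shapes: q, k, v are (batch, seq, dim)."""
    # Pairwise L1 distances ||q_i - k_j||_1 -> (batch, seq_q, seq_k).
    # Uses only subtraction and absolute value, so the query-key score
    # computation involves no matrix multiplication.
    dist = (q.unsqueeze(2) - k.unsqueeze(1)).abs().sum(dim=-1)
    # Laplacian kernel turns distances into non-negative similarity scores.
    scores = torch.exp(-dist / scale)
    # Normalize each query's weights to sum to 1, then mix the values.
    weights = scores / scores.sum(dim=-1, keepdim=True)
    return weights @ v

# Tiny usage example with random tensors.
q = torch.randn(2, 5, 8)
k = torch.randn(2, 5, 8)
v = torch.randn(2, 5, 8)
out = laplacian_l1_attention(q, k, v)
print(out.shape)  # torch.Size([2, 5, 8])
```

The key point of the design is that the score computation relies only on additions, subtractions, and absolute values, which is where the energy savings over dot-product attention are claimed to come from.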

Takeaways, Limitations

Takeaways:
  • Presents a new architecture that effectively addresses the high computational load and energy consumption of existing Transformers.
  • Matches or surpasses existing performance across diverse domains such as NLP, bioinformatics, and vision.
  • Can contribute significantly to the development of energy-efficient AI models.
Limitations:
  • The generalizability of the experimental results presented in the paper requires further verification.
  • The limitations of the Laplacian kernel and L1 metric need to be analyzed, including comparisons with other distance measures.
  • Additional performance evaluations are needed for models of varying sizes and complexities.