Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
Summaries on this page are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, please cite the source.

Static Word Embeddings for Sentence Semantic Representation

Created by
  • Haebom

Authors

Takashi Wada, Yuki Hirakawa, Ryotaro Shimizu, Takahiro Kawashima, Yuki Saito

Outline

This paper proposes novel static word embeddings optimized for representing sentence meaning. Word embeddings are extracted from a pre-trained Sentence Transformer, refined through sentence-level principal component analysis, and then further improved with knowledge distillation or contrastive learning. At inference time, a sentence is represented by simply averaging its word embeddings, which incurs minimal computational overhead. Evaluated on monolingual and cross-lingual tasks, the proposed model significantly outperforms existing static embedding models on sentence semantic tasks and even surpasses a basic Sentence Transformer model (SimCSE). Further analysis shows that the method removes word embedding components irrelevant to sentence meaning and rescales vector norms according to each word's influence on the sentence.
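To make the inference step concrete, here is a minimal Python sketch of representing a sentence by averaging static word embeddings, as the outline describes. The tiny embedding table, whitespace tokenization, and the `encode_sentence` helper are hypothetical placeholders for illustration, not the authors' released code or vocabulary.

```python
# Minimal sketch of the inference step: a sentence vector is just the mean of
# pre-computed static word embeddings, so no neural forward pass is needed.
# The embedding table below is a hypothetical placeholder.
import numpy as np

embeddings = {
    "static": np.array([0.1, 0.3, -0.2]),
    "word": np.array([0.4, -0.1, 0.2]),
    "embeddings": np.array([0.0, 0.2, 0.5]),
}

def encode_sentence(tokens: list[str]) -> np.ndarray:
    """Average the embeddings of known tokens."""
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return np.zeros(3)  # fallback for sentences with no known tokens
    return np.mean(vecs, axis=0)

sentence_vec = encode_sentence("static word embeddings".split())
print(sentence_vec)
```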

Takeaways, Limitations

Takeaways:
Extracting word embeddings from a Sentence Transformer and refining them with sentence-level principal component analysis, knowledge distillation, or contrastive learning improves sentence semantic representation.
Representing a sentence as a simple average of its word embeddings keeps inference fast and computationally cheap.
The proposed model outperforms existing static embedding models and even a basic Sentence Transformer model (SimCSE).
Embedding quality improves by removing word embedding components irrelevant to sentence meaning and adjusting vector norms according to each word's influence (an illustrative sketch of this component-removal idea follows this list).
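As referenced above, the following is an illustrative sketch of removing dominant principal components from an embedding matrix, the general idea behind PCA-style refinement of embeddings; it is not the paper's exact procedure, and the toy matrix size, dimensionality, and number of removed components are arbitrary assumptions.

```python
# Toy example of projecting out the top principal components of an embedding
# matrix, removing directions shared across many words that may carry little
# sentence-specific meaning. All numbers here are placeholders.
import numpy as np

rng = np.random.default_rng(0)
E = rng.normal(size=(1000, 64))          # toy embedding matrix: vocab x dim

k = 2                                    # number of components to remove
E_centered = E - E.mean(axis=0, keepdims=True)
_, _, Vt = np.linalg.svd(E_centered, full_matrices=False)
top_components = Vt[:k]                  # top-k principal directions, (k, dim)

# Subtract each embedding's projection onto the top-k directions.
E_refined = E_centered - E_centered @ top_components.T @ top_components
print(E_refined.shape)                   # (1000, 64)
```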
Limitations:
No specific limitations are stated in the abstract.