Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the service is run on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

Token Prepending: A Training-Free Approach for Eliciting Better Sentence Embeddings from LLMs

Created by
  • Haebom

Authors

Yuchen Fu, Zifeng Cheng, Zhiwei Jiang, Zhonghui Wang, Yafeng Yin, Zhengliang Li, Qing Gu

Outline

In this paper, we propose Token Prepending (TP), a novel technique for extracting sentence embeddings from large language models (LLMs). Existing methods use prompt engineering to induce LLMs to encode sentence information in the embedding of the last token, but causal attention prevents earlier tokens from attending to later ones, leading to biased encoding and a cascading effect. TP prepends each layer's decoded sentence embedding to the input of the next layer, so that earlier tokens can attend to whole-sentence information. It is a plug-and-play, training-free technique that integrates seamlessly with various prompt-based sentence embedding methods and autoregressive LLMs. Through extensive experiments on semantic textual similarity (STS) tasks and downstream classification tasks, we demonstrate that TP significantly improves the performance of existing methods while barely increasing the inference cost.
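The layer-wise prepending idea can be illustrated with a short sketch. The code below is a minimal conceptual illustration, not the authors' implementation: `embed`, `layers`, and `last_token_idx` are assumed placeholders for a decoder-only model's token embedding function, its stack of transformer layers, and the prompt position read out as the sentence embedding.

```python
# Minimal conceptual sketch of Token Prepending (TP); all helper names are assumptions.
import torch

def encode_with_tp(input_ids, embed, layers, last_token_idx):
    """Prepend each layer's decoded sentence embedding to the next layer's input.

    input_ids:      (batch, seq_len) token ids of the prompted sentence,
                    e.g. 'This sentence: "[text]" means in one word: "'
    embed:          token embedding function -> (batch, seq_len, hidden)
    layers:         iterable of causal transformer decoder layers
    last_token_idx: prompt position whose hidden state is read as the sentence embedding
    """
    hidden = embed(input_ids)                              # (B, T, H)
    # Reserve one slot at the front for the prepended sentence embedding;
    # here it is initialized with a copy of the first token's embedding (a placeholder choice).
    hidden = torch.cat([hidden[:, :1, :], hidden], dim=1)  # (B, T+1, H)

    for layer in layers:
        out = layer(hidden)                                # causal attention inside the layer
        out = out[0] if isinstance(out, tuple) else out
        # Read the current sentence embedding (hidden state at the prompt's readout
        # position) and write it into the prepended slot for the next layer, so that
        # early tokens can attend to whole-sentence information.
        sent_emb = out[:, last_token_idx + 1, :]           # +1 offset for the prepended slot
        out = out.clone()
        out[:, 0, :] = sent_emb
        hidden = out

    # Final sentence embedding: hidden state at the readout position after the last layer.
    return hidden[:, last_token_idx + 1, :]
```

Because only a single extra slot is prepended per forward pass and no parameters are trained, this kind of scheme adds essentially no inference overhead, which matches the paper's claim about cost.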

Takeaways, Limitations

Takeaways:
A novel TP technique is proposed to improve LLM-based sentence embedding extraction.
Easy to integrate with existing prompt-based methods in a plug-and-play manner.
Training-free, with virtually no additional inference cost.
Performance improvements are experimentally validated on various STS tasks and downstream classification tasks.
Limitations:
Further studies are needed to determine whether the effectiveness of TP generalizes to all LLMs and all prompt-based sentence embedding methods.
Analysis of applicability and effectiveness for LLMs that use different types of attention mechanisms is needed.
The efficiency and performance of TP on extremely long sentences remain to be evaluated.