Daily Arxiv

This page curates AI-related papers from around the world.
All content is summarized using Google Gemini, and the site is run on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

LLMs are Single-threaded Reasoners: Demystifying the Working Mechanism of Soft Thinking

Created by
  • Haebom

Authors

Chunhung Wu, Jinliang Lu, Zixuan Ren, Gangqiang Hu, Zhi Wu, Dai Dai, Hua Wu

Outline

This paper investigates the "soft thinking" capability of large language models (LLMs) under various exploration strategies. Soft thinking generates soft tokens so that reasoning can take place in a continuous concept space. Contrary to conventional wisdom, the authors find that LLMs rely primarily on the most influential component of each soft input during subsequent decoding, which hinders the exploration of diverse reasoning paths. To overcome this limitation, they introduce randomness through sampling strategies such as Dirichlet resampling and the Gumbel-Softmax trick, aiming to unlock the potential of soft thinking. Experimental results show that the Gumbel-Softmax trick delivers superior performance across eight reasoning benchmarks.
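The two sampling strategies named above both inject randomness into the soft token before it is fed back to the model. The sketch below illustrates how each might look; it is not the authors' implementation, and the function names, the `tau` and `concentration` parameters, and the use of the input embedding table as the mixing basis are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def gumbel_softmax_soft_token(logits, embedding_matrix, tau=1.0):
    """Soft token via the Gumbel-Softmax trick (illustrative sketch).

    logits: (vocab_size,) next-token logits from the LLM.
    embedding_matrix: (vocab_size, hidden_dim) input embedding table.
    tau: temperature; lower values push the weights toward one-hot.
    """
    # Perturb the logits with Gumbel(0, 1) noise to inject randomness.
    gumbel_noise = -torch.log(-torch.log(torch.rand_like(logits) + 1e-20) + 1e-20)
    weights = F.softmax((logits + gumbel_noise) / tau, dim=-1)
    # The soft token is a probability-weighted mixture of token embeddings,
    # fed back as the next input instead of a single discrete token.
    return weights @ embedding_matrix

def dirichlet_resampled_soft_token(probs, embedding_matrix, concentration=10.0):
    """Soft token via Dirichlet resampling (illustrative sketch).

    probs: (vocab_size,) next-token probabilities from the LLM.
    concentration: scales the Dirichlet parameters; larger values stay
    closer to the original distribution, smaller values add more noise.
    """
    # Draw a fresh probability vector centered on the model's distribution.
    weights = torch.distributions.Dirichlet(concentration * probs + 1e-6).sample()
    return weights @ embedding_matrix
```

In both cases the mixture weights vary from run to run, which is the mechanism the summary credits with enabling the model to explore more diverse reasoning paths than vanilla soft thinking.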

Takeaways, Limitations

Takeaways:
Provides an in-depth understanding of how soft thinking works in LLMs.
Identifies a limitation of existing soft thinking methods: restricted exploration of reasoning paths.
Shows that introducing randomness can improve soft thinking performance.
Verifies the effectiveness of the Gumbel-Softmax technique.
Limitations:
Evaluation is limited to eight benchmarks.
Other sampling strategies require further study.
Generalizability to various LLM architectures has not been verified.