Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

LLMs Have a Heart of Stone: Demystifying the Soft Thinking Ability of Large Reasoning Models

Created by
  • Haebom

Authors

Chunhung Wu, Jinliang Lu, Zixuan Ren, Gangqiang Hu, Zhi Wu, Dai Dai, Hua Wu

Outline

This paper analyzes the "soft thinking" ability of large language models (LLMs) using a range of probing techniques. Contrary to common expectations about soft thinking, the authors find that LLMs rely mainly on the single most influential component of each soft token, which limits exploration of alternative reasoning paths. This behavior is effectively equivalent to greedy decoding and negates the supposed advantage of soft tokens: conveying richer information than a single discrete token. To address this, the authors inject randomness through sampling strategies such as Dirichlet resampling and the Gumbel-Softmax trick, and verify their effectiveness experimentally across eight reasoning benchmarks. They find that Gumbel-Softmax achieves the best performance by providing sufficient randomness with controlled smoothness.
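The two sampling strategies named above can be sketched roughly as follows. This is a minimal NumPy sketch, not the paper's implementation: the function names and the `tau`/`alpha` parameterization are illustrative assumptions, and how the resulting mixture is fed back into the model is omitted.

```python
import numpy as np

def gumbel_softmax(logits, tau=1.0, rng=None):
    """Perturb logits with Gumbel(0, 1) noise, then apply a
    temperature-controlled softmax to get a "soft token": a probability
    mixture over the vocabulary instead of a single argmax id.
    Smaller tau sharpens the mixture toward one-hot; larger tau smooths it.
    """
    rng = rng or np.random.default_rng()
    u = rng.uniform(low=1e-12, high=1.0, size=np.shape(logits))
    g = -np.log(-np.log(u))        # Gumbel(0, 1) noise
    z = (np.asarray(logits) + g) / tau
    z = z - z.max()                # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def dirichlet_resample(probs, alpha=10.0, rng=None):
    """Draw a fresh probability vector centred on the model's distribution.
    Larger alpha concentrates the draw closer to the original probs;
    smaller alpha injects more randomness.
    """
    rng = rng or np.random.default_rng()
    return rng.dirichlet(alpha * np.asarray(probs) + 1e-12)
```

Both functions return a valid probability vector, so either can replace the deterministic argmax/greedy step while still producing a soft token for the next reasoning step.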

Takeaways, Limitations

Takeaways: The paper demonstrates that soft thinking with soft tokens can collapse into simple greedy decoding, and shows that performance improves when randomness is introduced through sampling strategies (notably Gumbel-Softmax). This deepens our understanding of the LLM reasoning process and points to ways of using soft thinking effectively.
Limitations: The effectiveness of the proposed sampling strategies may not extend beyond the evaluated benchmarks, and their generalizability to other LLMs or reasoning tasks requires further study. Moreover, injecting randomness does not always improve performance, and determining the optimal level of randomness remains an open problem.