This paper analyzes the "soft thinking" capabilities of large language models (LLMs) under various exploration techniques. Contrary to common expectations about soft thinking, we find that LLMs rely primarily on the most dominant component of each soft token, which limits their exploration of reasoning paths. This behavior resembles greedy decoding and obscures the intended advantage of conveying richer information through soft tokens. To address this issue, we introduce randomness through sampling strategies such as Dirichlet resampling and the Gumbel-Softmax trick, and experimentally verify their effectiveness on eight reasoning benchmarks. We find that the Gumbel-Softmax trick achieves the best performance by providing sufficient randomness with controlled smoothness.
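To make the two sampling strategies concrete, the sketch below illustrates one plausible way to perturb a soft token before it is fed back as the next input embedding. It is a minimal illustration, not the paper's implementation: the function names, the temperature `tau`, and the Dirichlet `concentration` parameter are assumptions introduced here for exposition.

```python
import torch
import torch.nn.functional as F


def gumbel_softmax_soft_token(logits: torch.Tensor, embedding: torch.Tensor,
                              tau: float = 1.0) -> torch.Tensor:
    """Build a soft token embedding with the Gumbel-Softmax trick.

    logits:    (vocab_size,) next-token logits from the model.
    embedding: (vocab_size, hidden_dim) input embedding matrix.
    tau:       temperature controlling how smooth the soft distribution is.
    """
    # Perturb the logits with Gumbel(0, 1) noise, then renormalize with a
    # temperature-scaled softmax to obtain a randomized soft distribution.
    gumbel_noise = -torch.log(-torch.log(torch.rand_like(logits) + 1e-10) + 1e-10)
    soft_probs = F.softmax((logits + gumbel_noise) / tau, dim=-1)
    # The soft token is the probability-weighted mixture of token embeddings.
    return soft_probs @ embedding


def dirichlet_resample_soft_token(probs: torch.Tensor, embedding: torch.Tensor,
                                  concentration: float = 100.0) -> torch.Tensor:
    """Resample the soft distribution from a Dirichlet centered on the model's probs."""
    alpha = concentration * probs + 1e-6  # keep concentration parameters positive
    soft_probs = torch.distributions.Dirichlet(alpha).sample()
    return soft_probs @ embedding
```

In both cases the next-step input remains a convex combination of token embeddings, so the "softness" is preserved; the injected noise simply prevents the model from collapsing onto the single dominant component at every step.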