This paper investigates the "soft thinking" capabilities of large language models (LLMs) using various exploration strategies. Soft thinking generates soft tokens in order to reason within a continuous concept space. Contrary to conventional wisdom, we find that LLMs rely primarily on the most dominant component of a soft input during subsequent decoding, which hinders the exploration of diverse reasoning paths. To overcome this limitation, we introduce randomness through sampling strategies such as Dirichlet resampling and the Gumbel-Softmax trick, aiming to unlock the potential of soft thinking. Experimental results show that the Gumbel-Softmax trick delivers the strongest performance across eight reasoning benchmarks.
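
For illustration only, the PyTorch sketch below shows one way the two sampling strategies named above could inject randomness into soft-token construction (mixing token embeddings by a sampled probability vector). The function names, the temperature `tau`, and the Dirichlet concentration `scale` are assumptions for this sketch, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def gumbel_softmax_soft_token(logits, embedding_matrix, tau=0.7):
    # Perturb logits with Gumbel(0, 1) noise.
    u = torch.rand_like(logits).clamp_min(1e-9)
    gumbel = -torch.log(-torch.log(u))
    # Temperature-controlled softmax over the perturbed logits gives a
    # randomized probability vector over the vocabulary.
    probs = F.softmax((logits + gumbel) / tau, dim=-1)
    # The soft token fed back to the model is the probability-weighted
    # mixture of token embeddings.
    return probs @ embedding_matrix

def dirichlet_resample_soft_token(logits, embedding_matrix, scale=100.0):
    # Resample a probability vector from a Dirichlet distribution whose
    # concentration is proportional to the model's predicted probabilities.
    probs = F.softmax(logits, dim=-1)
    resampled = torch.distributions.Dirichlet(scale * probs + 1e-6).sample()
    return resampled @ embedding_matrix

# Usage with toy shapes: logits over a vocabulary and an embedding table.
vocab_size, hidden_size = 32000, 4096
logits = torch.randn(vocab_size)
embeddings = torch.randn(vocab_size, hidden_size)
soft_token = gumbel_softmax_soft_token(logits, embeddings)  # (hidden_size,)
```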