To study the exploration and exploitation (E&E) strategies of large language models (LLMs), we use a classic multi-armed bandit (MAB) task introduced in the cognitive science and psychiatry literature. We compare the E&E strategies of LLMs, humans, and MAB algorithms, and investigate how activating thought traces, via prompting strategies and mental models, affects LLMs' decision-making. Our results show that activating thought induces human-like behavioral changes in LLMs, which exhibit human-like levels of exploration in simple environments. However, in more complex and unstable environments, LLMs fail to match human adaptability in effective directed exploration.
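
As a point of reference for the MAB algorithms mentioned above, the following is a minimal sketch, assuming a two-armed Gaussian bandit whose hidden reward means drift over time (a stand-in for the "unstable environment") and a UCB1 agent as one example of a standard bandit baseline; the task parameters and baseline algorithms used in the actual experiments are not specified here.

```python
import numpy as np

rng = np.random.default_rng(0)

def run_ucb1(n_arms=2, n_trials=150, drift_sd=0.0):
    """Run UCB1 on a Gaussian bandit; drift_sd > 0 makes the environment unstable."""
    true_means = rng.normal(0.0, 1.0, size=n_arms)   # hidden reward means (assumed)
    counts = np.zeros(n_arms)                        # pulls per arm
    estimates = np.zeros(n_arms)                     # sample-mean reward estimates
    total_reward = 0.0

    for t in range(1, n_trials + 1):
        if t <= n_arms:
            # Pull each arm once to initialise the estimates.
            arm = t - 1
        else:
            # UCB1: choose the arm with the highest mean plus exploration bonus.
            bonus = np.sqrt(2.0 * np.log(t) / counts)
            arm = int(np.argmax(estimates + bonus))

        reward = rng.normal(true_means[arm], 1.0)
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward

        # Random-walk drift of the hidden means models an unstable environment.
        true_means += rng.normal(0.0, drift_sd, size=n_arms)

    return total_reward

print("stable:  ", run_ucb1(drift_sd=0.0))
print("unstable:", run_ucb1(drift_sd=0.2))
```

In this illustrative setup, the exploration bonus shrinks as an arm is sampled more often, so the agent's directed exploration fades in a stable environment but becomes insufficient when the reward means keep drifting.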