This study aims to improve the performance of large language models (LLMs) through test-time scaling, focusing in particular on the efficiency of Best-of-N (BoN) sampling. To address BoN sampling's high GPU memory consumption and its dependence on an external reward model, we propose Self-Truncation Best-of-N (ST-BoN), which leverages the early consistency of the model's internal states to estimate the most promising sample and prune the remaining paths before all N samples are fully generated. ST-BoN reduces computational cost by 70-80% while matching the performance of Full-BoN, and at the same cost it improves accuracy by 3-4 points.
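To make the selection mechanism concrete, the sketch below illustrates the general idea under stated assumptions; it is not the paper's exact algorithm. It assumes each of the N partially decoded samples has been summarized into a single pooled early hidden-state embedding, and it scores consistency as the sum of pairwise cosine similarities, keeping the sample that agrees most with the others. The function name `st_bon_select` and the pooling step are hypothetical.

```python
import numpy as np

def st_bon_select(early_hidden: np.ndarray) -> int:
    """Pick the sample whose early hidden states agree most with the others.

    early_hidden: (N, D) array, one pooled early-decoding embedding per sample.
    Returns the index of the estimated-best sample; the other N-1 paths
    would be truncated instead of being decoded to completion.
    """
    # Normalize rows so the dot product becomes cosine similarity.
    normed = early_hidden / np.linalg.norm(early_hidden, axis=1, keepdims=True)
    sim = normed @ normed.T                 # (N, N) pairwise consistency matrix
    np.fill_diagonal(sim, 0.0)              # ignore self-similarity
    consistency = sim.sum(axis=1)           # total agreement with the other paths
    return int(np.argmax(consistency))      # the most "consensual" path survives

# Toy usage: 8 candidate paths, each summarized by a 16-dim pooled embedding
# taken after decoding a short prefix.
rng = np.random.default_rng(0)
hidden = rng.normal(size=(8, 16))
best = st_bon_select(hidden)
print(f"continue decoding only sample {best}; truncate the rest")
```

Because only one path is decoded to completion, such a scheme spends roughly the cost of one full generation plus N short prefixes, which is consistent with the 70-80% cost reduction reported above; it also needs no reward model, since the ranking signal comes from the model's own internal states.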