This paper presents the Dynamic Reasoning-Boundary Self-Awareness Framework (DR. SAF) to address the efficiency challenges of large language models (LLMs) that rely on long chains of thought (CoTs) for complex reasoning tasks. DR. SAF integrates three core components: Boundary Self-Awareness Alignment, Adaptive Reward Management, and a Boundary Preservation Mechanism. Together, these enable the model to dynamically assess problem complexity and adjust its reasoning depth accordingly. Experimental results show that DR. SAF reduces response tokens by 49.27%, improves token efficiency by 6.59x, and shortens training time by a factor of 5, all with minimal accuracy degradation. Notably, under extreme training conditions, it improves token efficiency by more than 16% while achieving higher accuracy than conventional instruction-based models.