This paper presents an analysis of a simple test-time scaling technique that replicates the scaling behavior of models distilled from o1-like models by manually controlling the amount of test-time compute. The analysis reveals that the scaling behavior is driven primarily by scaling down via maximum length constraints. In contrast, fine-tuning on long CoT data does not significantly affect the scaling behavior, and scaling up by appending "Wait" is inconsistent, as the model can oscillate between solutions.

There is an important distinction between scaling down via maximum length constraints and scaling up test-time compute in o1-like models (e.g., DeepSeek-R1). O1-like models are allowed to use as much compute as they need, limited only by the maximum supported length of the model. By naturally learning to scale up test-time compute during reinforcement learning, o1-like models outperform state-of-the-art models when scaling up. In contrast, simple test-time scaling gradually lowers the upper bound on model performance as it scales down. While it is easy to replicate the test-time scaling behavior of the o1 model by scaling down, it is important to recognize that the goal of scaling test-time compute is to achieve higher performance than the model was originally capable of, not simply to reproduce the appearance of the scaling behavior.
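For concreteness, the following is a minimal sketch of the two interventions discussed above, assuming a Hugging Face `transformers` causal LM: scaling down by capping the number of generated tokens, and scaling up by appending "Wait" so that the model continues its chain of thought. The checkpoint name and the `generate_with_budget` helper are illustrative placeholders, not the exact implementation evaluated in this paper.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint name; substitute any o1-style distilled reasoning model.
MODEL_NAME = "distilled-reasoning-model"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.bfloat16)


def generate_with_budget(prompt: str, max_thinking_tokens: int, num_waits: int = 0) -> str:
    """Scale down by hard-capping generated tokens; optionally scale up by
    appending "Wait" and letting the model continue its reasoning."""
    text = prompt
    for step in range(num_waits + 1):
        # Re-tokenizing the decoded text each round is for clarity only;
        # a real implementation would carry the token ids forward directly.
        inputs = tokenizer(text, return_tensors="pt").to(model.device)
        output = model.generate(
            **inputs,
            max_new_tokens=max_thinking_tokens,  # the maximum length constraint (scaling down)
            do_sample=False,
        )
        text = tokenizer.decode(output[0], skip_special_tokens=True)
        if step < num_waits:
            text += "\nWait"  # force continued reasoning (scaling up)
    return text


# Example: cap the reasoning at 512 tokens, then extend it once with "Wait".
# answer = generate_with_budget("Solve: 12 * 17 = ?", max_thinking_tokens=512, num_waits=1)
```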