In this paper, we propose Outlier-Safe Pre-Training (OSP), a proactive alternative to post-hoc mitigation of the extreme activation outliers that degrade the quantization performance of large language models (LLMs). OSP combines three key components, the Muon optimizer, Single-Scale RMSNorm, and a learnable embedding projection, to prevent outliers from forming during training. We train a 1.4-billion-parameter model on 1 trillion tokens; under aggressive 4-bit quantization, it achieves an average score of 35.7 across ten benchmarks (vs. 26.5 for an Adam-trained baseline), with only a 2% training overhead. These results demonstrate that extreme outliers in LLMs are artifacts of the training strategy rather than inherent properties of the models. The source code and pre-trained checkpoints are available on GitHub.
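As a rough illustration of one component, the sketch below shows how a Single-Scale RMSNorm layer might be written in PyTorch, assuming the layer replaces the usual per-channel gain vector of RMSNorm with a single learnable scalar so that no individual channel can be selectively amplified; the class name `SingleScaleRMSNorm` and this interpretation are assumptions for illustration, not taken from the released implementation.

```python
import torch
import torch.nn as nn


class SingleScaleRMSNorm(nn.Module):
    """Minimal sketch: RMSNorm with one shared learnable gain.

    Assumed interpretation of Single-Scale RMSNorm; standard RMSNorm
    instead learns a separate gain per hidden channel.
    """

    def __init__(self, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        # Single scalar gain shared across all channels (assumption).
        self.scale = nn.Parameter(torch.ones(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Normalize by the root-mean-square over the hidden dimension,
        # then apply the shared scalar gain.
        inv_rms = x.pow(2).mean(dim=-1, keepdim=True).add(self.eps).rsqrt()
        return x * inv_rms * self.scale
```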