While improving the quality and size of pre-training data is known to improve downstream performance, the impact of text complexity (reading difficulty) has received comparatively little study. We reduced surface complexity, that is, we used shorter sentences, more common words, and simpler structures while keeping the core content largely intact, to study (i) how text complexity affects models of different sizes, (ii) whether useful representations can be learned from simple text alone, and (iii) how the complexity of pre-training text affects downstream language understanding. To this end, we used a large language model to simplify human-written texts. Causal language models (28M-500M parameters) were pre-trained from scratch on both the original and the simplified data, and then fine-tuned and evaluated under zero-shot settings.
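A minimal sketch of the simplification step described above, assuming an instruction-tuned model queried through the Hugging Face `transformers` text-generation pipeline; the model name and prompt wording are illustrative placeholders, not the exact setup used in this work:

```python
from transformers import pipeline

# Placeholder model name; the specific simplification model is an assumption here.
simplifier = pipeline("text-generation", model="Qwen/Qwen2.5-7B-Instruct")

PROMPT_TEMPLATE = (
    "Rewrite the following passage using shorter sentences, more common words, "
    "and simpler sentence structure. Keep the core content the same.\n\n"
    "Passage:\n{passage}\n\nSimplified passage:\n"
)

def simplify(passage: str) -> str:
    """Return a surface-simplified rewrite of `passage`."""
    prompt = PROMPT_TEMPLATE.format(passage=passage)
    out = simplifier(
        prompt,
        max_new_tokens=512,
        do_sample=False,          # deterministic rewrite
        return_full_text=False,   # keep only the generated continuation
    )
    return out[0]["generated_text"].strip()
```

Applying such a function over a human-written corpus yields a parallel simplified corpus, so that the original and simplified pre-training sets cover the same underlying content and differ mainly in surface complexity.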