Text-to-image models demonstrate a remarkable ability to generate high-quality images from natural language descriptions, yet they remain highly vulnerable to adversarial prompts that bypass safety measures and elicit malicious content. In this paper, we experimentally study the text encoder of the Stable Diffusion (SD) model and find that the [EOS] token acts as a semantic aggregator, exhibiting distinct distribution patterns between legitimate and adversarial prompts. Building on this observation, we introduce SafeGuider, a two-stage framework for robust safety control that does not compromise generation quality. By combining an embedding-level awareness model with a safety-aware feature-suppressing beam search algorithm, SafeGuider maintains high-quality image generation for legitimate prompts while ensuring robust defense against both in-domain and out-of-domain attacks. SafeGuider reduces the attack success rate to at most 5.48% across various attack scenarios, and it enhances practicality by generating safe, meaningful images for unsafe prompts rather than rejecting them or producing black images. Furthermore, we demonstrate that SafeGuider can be effectively applied to other text-to-image models, such as the Flux model, in addition to SD.
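As a rough illustration of the embedding-level check summarized above, the sketch below (not the authors' implementation) extracts the [EOS] token embedding from SD's CLIP text encoder using Hugging Face transformers; the `safety_head` classifier referenced in the final comment is a hypothetical placeholder standing in for a trained awareness model.

```python
# Minimal sketch, assuming SD v1.x's CLIP text encoder (openai/clip-vit-large-patch14).
# The pooled output of CLIPTextModel is the hidden state at the [EOS] position,
# which the paper identifies as a semantic aggregator of the prompt.
import torch
from transformers import CLIPTokenizer, CLIPTextModel

tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")
text_encoder = CLIPTextModel.from_pretrained("openai/clip-vit-large-patch14")

def eos_embedding(prompt: str) -> torch.Tensor:
    """Return the [EOS] token embedding for a prompt."""
    inputs = tokenizer(
        prompt,
        padding="max_length",
        truncation=True,
        max_length=tokenizer.model_max_length,
        return_tensors="pt",
    )
    with torch.no_grad():
        outputs = text_encoder(**inputs)
    return outputs.pooler_output.squeeze(0)  # [EOS]-position hidden state

# Hypothetical stage-1 usage: a classifier trained on [EOS] embeddings would flag
# prompts whose embeddings fall within the adversarial distribution.
# unsafe_score = safety_head(eos_embedding("a photo of a cat"))
```

A prompt flagged by such a check would then be handled by the second stage, where the safety-aware beam search suppresses the offending features rather than refusing generation outright.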