This paper proposes AnomalyControl, a novel anomaly synthesis framework that addresses the shortcomings of existing text-to-image anomaly synthesis methods, which rely solely on textual information or coarsely aligned visual features and therefore fail to capture the complex characteristics of anomalies. AnomalyControl uses cross-modal semantic features as guidance signals, encoding generalized anomaly semantics from text-image reference prompts. Specifically, it takes a mismatched prompt pair (a text-image reference prompt and a targeted text prompt) and applies a cross-modal semantic modeling (CSM) module together with an anomaly-semantic enhanced attention (ASEA) mechanism to focus on the subtle visual patterns of anomalies, improving the realism and contextual relevance of the generated anomaly features. Finally, a semantic guided adapter (SGA) takes the cross-modal semantic features as prior information and encodes effective guidance signals for an appropriate and controllable synthesis process. Experimental results demonstrate that AnomalyControl outperforms existing methods, achieving state-of-the-art performance on anomaly synthesis and downstream tasks.
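To make the described pipeline concrete, the following is a minimal sketch of how the three components might be wired together. Everything here is an assumption for exposition: the class names mirror the paper's module names, but the internals (cross-attention for CSM, a gating layer for ASEA, a linear projection for SGA) and all dimensions are illustrative choices, not the authors' implementation.

```python
import torch
import torch.nn as nn

class CSM(nn.Module):
    """Cross-modal semantic modeling (sketch): text tokens attend to the
    reference-image patches to extract a cross-modal semantic feature."""
    def __init__(self, dim=768, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, text_emb, ref_img_emb):
        # Target text queries the text-image reference prompt for anomaly semantics.
        fused, _ = self.attn(query=text_emb, key=ref_img_emb, value=ref_img_emb)
        return self.norm(text_emb + fused)

class ASEA(nn.Module):
    """Anomaly-semantic enhanced attention (sketch): re-weights the fused
    feature to emphasize subtle anomaly patterns, approximated here by gating."""
    def __init__(self, dim=768):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(dim, dim), nn.Sigmoid())

    def forward(self, fused):
        return fused * self.gate(fused)

class SGA(nn.Module):
    """Semantic guided adapter (sketch): projects the cross-modal semantic
    feature into a guidance signal for the (frozen) synthesis backbone."""
    def __init__(self, dim=768, cond_dim=1024):
        super().__init__()
        self.proj = nn.Linear(dim, cond_dim)

    def forward(self, enhanced):
        return self.proj(enhanced)

# Usage: a mismatched prompt pair becomes a controllable guidance signal.
text_emb = torch.randn(1, 77, 768)      # targeted text prompt tokens (e.g. from a CLIP-style encoder)
ref_img_emb = torch.randn(1, 256, 768)  # patch embeddings of the reference image
guidance = SGA()(ASEA()(CSM()(text_emb, ref_img_emb)))
print(guidance.shape)  # torch.Size([1, 77, 1024])
```

In this reading, the guidance tensor would replace or augment the text conditioning of a diffusion generator, which is how an adapter of this kind typically steers synthesis; the paper's actual injection mechanism may differ.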