Accurately forecasting the impact of macroeconomic events is critical for investors and policymakers, yet existing forecasting methods focus on either text analysis or time series modeling alone. As a result, they fail to capture both the multi-modal nature of financial markets and the causal relationships between events and price fluctuations. To address these gaps, this paper proposes Causal-Augmented Multi-Modality Event-Driven Financial Forecasting (CAMEF), a multi-modal framework that integrates text and time series data with a causal learning mechanism and an LLM-based counterfactual event augmentation technique. CAMEF captures the causal relationship between policy text and historical price data. It is trained and evaluated on a novel financial dataset consisting of announcements of six macroeconomic indicators and high-frequency real-time trading data for five major US financial assets from 2008 to April 2024. The LLM-based counterfactual event augmentation strategy further improves forecasting performance. We compare CAMEF against state-of-the-art transformer-based time series and multi-modal baseline models, and conduct ablation studies to verify the effectiveness of the causal learning mechanism and the contribution of different event types.