In this paper, we propose ADPC (Alzheimer's Disease Prediction with Cross-modal Causal Intervention), a novel visual-linguistic causal intervention framework that addresses the selection bias and confounding arising from complex relationships among variables in multimodal data, with the goal of diagnosing mild cognitive impairment (MCI) early and delaying progression to Alzheimer's disease (AD). ADPC uses a large language model (LLM) to maintain structured text output even when datasets are incomplete or imbalanced, and classifies participants as cognitively normal (CN), MCI, or AD from MRI and fMRI images together with the LLM-generated text. Causal intervention removes the influence of confounding variables (e.g., neuroimaging artifacts, age-related biomarkers), yielding more reliable results. Experimental results show that ADPC achieves state-of-the-art (SOTA) performance on most evaluation metrics in distinguishing CN, MCI, and AD cases. This study demonstrates the potential of integrating multimodal learning and causal inference for the diagnosis of neurological diseases.
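
To make the idea of removing a confounder's influence concrete, the following is a minimal sketch of the textbook backdoor adjustment on a toy discrete example. It is an illustration under stated assumptions, not ADPC's actual intervention module: the abstract does not say which adjustment is used, and the variables (a binary confounder `Z`, a binary image-derived feature `X`, a binary outcome `Y`), the function names, and all probabilities are hypothetical and invented for this example.

```python
# Toy backdoor adjustment. All numbers below are invented for illustration;
# they are not taken from the paper or from any dataset.

# P(Z=z): prior over a hypothetical binary confounder (e.g., an age group).
P_Z = {0: 0.5, 1: 0.5}

# P(X=1 | Z=z): the confounder influences the image-derived feature X.
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}

# P(Y=1 | X=x, Z=z): the outcome depends on both X and Z, keyed by (x, z).
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (1, 0): 0.2, (0, 1): 0.5, (1, 1): 0.6}


def p_x_given_z(x: int, z: int) -> float:
    """Return P(X=x | Z=z) for binary x."""
    p1 = P_X1_GIVEN_Z[z]
    return p1 if x == 1 else 1.0 - p1


def observational(x: int) -> float:
    """P(Y=1 | X=x) = sum_z P(Y=1 | x, z) * P(z | x); mixes in the effect of Z."""
    p_x = sum(p_x_given_z(x, z) * P_Z[z] for z in P_Z)
    return sum(
        P_Y1_GIVEN_XZ[(x, z)] * p_x_given_z(x, z) * P_Z[z] / p_x for z in P_Z
    )


def interventional(x: int) -> float:
    """P(Y=1 | do(X=x)) = sum_z P(Y=1 | x, z) * P(z); the backdoor adjustment."""
    return sum(P_Y1_GIVEN_XZ[(x, z)] * P_Z[z] for z in P_Z)


if __name__ == "__main__":
    naive_gap = observational(1) - observational(0)
    causal_gap = interventional(1) - interventional(0)
    print(f"naive difference:  {naive_gap:.2f}")
    print(f"causal difference: {causal_gap:.2f}")
```

In this toy setting the naive conditional difference overstates the effect of `X` on `Y`, because `Z` raises both; weighting by the prior `P(z)` instead of `P(z | x)` removes that spurious contribution.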