In this paper, we present ViTa, a foundation model that integrates cardiac magnetic resonance (CMR) imaging with patient-level health factors to support a comprehensive understanding of cardiac health and accurate interpretation of individual disease risk. Using data from 42,000 UK Biobank participants, we combine 3D+T cine stacks of short- and long-axis images with detailed tabular patient-level factors. This multimodal paradigm supports multiple downstream tasks within a single, unified framework: cardiac phenotype and physiological feature prediction, segmentation, and classification of cardiac and metabolic diseases. By learning a shared latent representation that connects rich image features with patient context, ViTa shows the potential to move beyond existing task-specific models toward a general, patient-specific understanding of cardiac health, which could improve the clinical utility and scalability of cardiac analytics.