This paper presents ViTa, a model that integrates cardiac magnetic resonance (CMR) imaging with patient-level health factors to enable a comprehensive understanding of cardiac health and personalized interpretation of disease risk. Using data from 42,000 UK Biobank participants, we combine 3D+T cine stacks in short-axis and long-axis views with detailed tabular patient-level factors. This multimodal paradigm supports multiple tasks within a single framework, including cardiac phenotype and physiological feature prediction, segmentation, and classification of cardiac and metabolic diseases. By learning a shared latent representation that connects rich image features with patient context, we aim to provide a patient-specific understanding of cardiac health that goes beyond existing task-specific models.