This paper presents MIRROR, a novel multimodal self-supervised learning method for histopathology and transcriptomics in cancer research. Whereas existing multimodal integration methods focus on modality alignment, MIRROR aligns the two modalities while also preserving modality-specific structure, accounting for the heterogeneity between histopathology and transcriptomics. It builds a comprehensive cancer feature representation from four components: a dedicated encoder for each modality, a modality alignment module, a modality maintenance module, and a style clustering module. Experiments on the TCGA cohort show strong performance in cancer subtype classification and survival analysis.
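
To make the idea of an alignment module more concrete, the sketch below shows one common way to align paired embeddings from two modalities: a symmetric contrastive (InfoNCE-style) loss. This is an assumption made for illustration only. The summary above does not state which objective MIRROR uses, and the names `alignment_loss`, `histo_emb`, `rna_emb`, and `temperature` are hypothetical.

```python
import math
import random


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def alignment_loss(histo_emb, rna_emb, temperature=0.1):
    """Symmetric InfoNCE loss over paired embeddings.

    Hypothetical stand-in for an alignment objective, not MIRROR's actual loss.
    Row i of histo_emb and row i of rna_emb are assumed to come from the same case.
    """
    h = [l2_normalize(v) for v in histo_emb]
    r = [l2_normalize(v) for v in rna_emb]
    n = len(h)
    # logits[i][j] compares slide embedding i with expression embedding j.
    logits = [
        [sum(a * b for a, b in zip(h[i], r[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    def cross_entropy_diag(rows):
        # Cross-entropy where the correct "class" for row i is column i.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)  # subtract the max for numerical stability
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    transposed = [list(col) for col in zip(*logits)]
    # Average the histology-to-expression and expression-to-histology directions.
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(transposed))


# Toy data: 4 cases with 8-dimensional embeddings (random, not real features).
random.seed(0)
histo = [[random.gauss(0, 1) for _ in range(8)] for _ in range(4)]
aligned = alignment_loss(histo, histo)                     # correct pairing
mismatched = alignment_loss(histo, histo[1:] + histo[:1])  # pairs shifted by one
print(aligned, mismatched)
```

In this toy setting the correctly paired embeddings give a lower loss than the shifted pairing, which is the behavior an alignment objective is meant to reward. A loss of this kind pulls the two modalities together, so a method that also wants to keep modality-specific structure would need an additional term alongside it.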