This paper presents a novel integrated framework for dynamic origin-destination demand estimation (DODE) in multi-class mesoscopic network models, combining high-resolution satellite imagery with existing traffic data from local sensors. Unlike sparse local detectors, satellite imagery provides consistent, city-wide road and traffic information on both parked and moving vehicles, thereby overcoming data availability limitations. To extract this information, we design a computer vision pipeline that performs class-specific vehicle detection and map matching to generate link-level traffic density observations for each vehicle class. Based on these observations, we formulate a computational graph-based DODE model that calibrates dynamic network conditions by jointly matching traffic volumes and travel times observed by local sensors and density measurements derived from satellite imagery. To evaluate the accuracy and scalability of the proposed framework, we perform a series of numerical experiments using synthetic and real-world data. Out-of-sample test results show that augmenting existing data with satellite-derived density significantly improves estimation performance, especially for links without local sensors. Real-world experiments also demonstrate the framework’s ability to handle large-scale networks, supporting its feasibility for practical deployment in cities of various sizes. A sensitivity analysis further evaluates how the quality of the satellite imagery data affects estimation performance.