Vision-language models (VLMs) enable natural language interaction with satellite imagery, yet existing models fall short on long-term, low-resolution data. To address this gap, this paper presents Landsat30-AU, a large-scale vision-language dataset built from more than 36 years of 30-meter-resolution imagery collected over Australia by four Landsat satellites (5, 7, 8, and 9). Landsat30-AU comprises two components: Landsat30-AU-Cap, containing 196,262 image-caption pairs, and Landsat30-AU-VQA, containing 17,725 human-verified visual question answering (VQA) samples spanning eight remote sensing domains. Our evaluation shows that existing VLMs struggle to interpret low-resolution satellite imagery, and that lightweight fine-tuning on Landsat30-AU improves their performance.