Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Landsat30-AU: A Vision-Language Dataset for Australian Landsat Imagery

Created by
  • Haebom

Authors

Sai Ma, Zhuang Li, John A Taylor

Outline

To address the limitations of vision-language models (VLMs), which enable natural language interaction with satellite imagery, this paper presents Landsat30-AU, a large-scale vision-language dataset built from 30-meter-resolution imagery collected over Australia by four Landsat satellites (5, 7, 8, and 9) across more than 36 years. Landsat30-AU consists of two components: Landsat30-AU-Cap, containing 196,262 image-caption pairs, and Landsat30-AU-VQA, containing 17,725 human-verified visual question answering (VQA) samples across eight remote sensing domains. The authors show that existing VLMs struggle to understand low-resolution satellite imagery and that lightweight fine-tuning on Landsat30-AU improves their performance.
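As a rough illustration of what 30-meter resolution means in practice, the sketch below computes the ground area covered by an image patch. The 256 x 256 patch size and the function names are assumptions made for illustration only; this summary does not state the patch dimensions used in Landsat30-AU.

```python
# Ground sample distance stated in the summary: 30 meters per pixel.
LANDSAT_GSD_M = 30.0


def patch_footprint_km(width_px: int, height_px: int,
                       gsd_m: float = LANDSAT_GSD_M) -> tuple[float, float]:
    """Return the (width, height) ground extent of a patch in kilometers."""
    return (width_px * gsd_m / 1000.0, height_px * gsd_m / 1000.0)


def patch_area_km2(width_px: int, height_px: int,
                   gsd_m: float = LANDSAT_GSD_M) -> float:
    """Return the ground area covered by a patch in square kilometers."""
    width_km, height_km = patch_footprint_km(width_px, height_px, gsd_m)
    return width_km * height_km


if __name__ == "__main__":
    # Hypothetical 256 x 256 pixel patch (patch size is an assumption).
    w_km, h_km = patch_footprint_km(256, 256)
    print(f"Footprint: {w_km:.2f} km x {h_km:.2f} km")
    print(f"Area: {patch_area_km2(256, 256):.2f} km^2")
```

At this scale a single pixel spans 30 m on the ground, so individual buildings or vehicles are not resolved, which is consistent with the summary's point that models tuned on high-resolution imagery struggle here.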

Takeaways, Limitations

Takeaways:
The paper provides Landsat30-AU, a large-scale vision-language dataset of long-term, low-resolution, multi-satellite imagery, laying a foundation for overcoming the limitations of existing VLMs.
Experiments show that existing VLMs understand satellite imagery poorly and that fine-tuning can improve their performance.
The dataset opens up new possibilities for Earth observation and analysis research based on low-resolution satellite imagery.
Limitations:
Because the dataset is limited to the Australian region, further validation of global generalization performance is needed.
VLMs' ability to understand satellite imagery remains limited, so more advanced models and techniques are needed.
The bootstrapping pipeline used to create the dataset is not described in detail.