To address the challenges of developing the artificial intelligence (AI) algorithms that play a key role in cutting-edge digital health technologies for diabetes management, this paper presents Glucose-ML, a collection of 10 publicly available diabetes datasets published from 2018 to 2025. Glucose-ML contains over 300,000 days of continuous glucose monitoring (CGM) data (38 million glucose samples in total) from over 2,500 people with type 1 diabetes, type 2 diabetes, prediabetes, or no diabetes across four countries. To help researchers use the collection effectively, we provide a comparative analysis of the datasets and a case study centered on the AI task of blood glucose prediction. The case study demonstrates that prediction results can vary significantly across datasets, even for the same algorithm, and we offer recommendations for developing robust AI solutions based on these findings. We provide links and code for all datasets.