Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Glucose-ML: A collection of longitudinal diabetes datasets for development of robust AI solutions

Created by
  • Haebom

Authors

Temiloluwa Prioleau, Baiying Lu, Yanjun Cui

Outline

To address the challenges of developing the artificial intelligence (AI) algorithms that underpin modern digital health technologies for diabetes management, this paper presents the Glucose-ML collection: 10 publicly available diabetes datasets published between 2018 and 2025. Glucose-ML contains over 300,000 days of continuous glucose monitoring (CGM) data (38 million glucose samples in total) from more than 2,500 people across four countries, covering type 1 diabetes, type 2 diabetes, prediabetes, and no diabetes. To help researchers use the collection effectively, the authors provide a comparative analysis of the datasets and a case study on the AI task of blood glucose prediction. The case study shows that prediction results for the same algorithm can vary significantly from one dataset to another, and the authors draw on this finding to offer recommendations for developing robust AI solutions. Links and code for all datasets are provided.
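The point that one algorithm can score very differently on different datasets can be illustrated with a minimal sketch. This is not the paper's method or data: the predictor is a simple persistence baseline (carry the last reading forward), the two traces are synthetic sinusoids standing in for cohorts with low and high glucose variability, and the names `persistence_rmse`, `synthetic_cgm`, `stable_cohort`, and `variable_cohort`, as well as the 5-minute sampling interval, are all assumptions made for illustration.

```python
import math


def persistence_rmse(series, horizon_steps):
    """RMSE of a last-value-carried-forward forecast `horizon_steps` samples ahead."""
    if horizon_steps < 1 or len(series) <= horizon_steps:
        raise ValueError("series must be longer than the forecast horizon")
    errors = [
        series[i + horizon_steps] - series[i]
        for i in range(len(series) - horizon_steps)
    ]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def synthetic_cgm(n, mean, amplitude, period):
    """Toy sinusoidal glucose trace in mg/dL. Synthetic, NOT real CGM data."""
    return [mean + amplitude * math.sin(2 * math.pi * i / period) for i in range(n)]


# Hypothetical stand-ins for two datasets with different glucose variability.
datasets = {
    "stable_cohort": synthetic_cgm(288, mean=110, amplitude=10, period=96),
    "variable_cohort": synthetic_cgm(288, mean=160, amplitude=60, period=96),
}

# 6 steps = 30 minutes, assuming one reading every 5 minutes.
HORIZON_STEPS = 6

# Same algorithm, same horizon, different data.
results = {
    name: persistence_rmse(trace, HORIZON_STEPS) for name, trace in datasets.items()
}
for name, rmse in results.items():
    print(f"{name}: RMSE = {rmse:.1f} mg/dL")
```

Because the error of a persistence forecast grows with how fast glucose changes, the more variable trace yields a larger RMSE even though the algorithm is identical. Reporting results per dataset, rather than on a single dataset, is one way to make such differences visible.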

Takeaways, Limitations

Takeaways:
Provides 10 public diabetes datasets to accelerate the development of AI-based diabetes management technologies.
Supports dataset selection by algorithm developers through a comparative analysis of the datasets.
Demonstrates, through a case study on blood glucose prediction, that the same algorithm performs differently across datasets, and offers guidance for developing robust AI models.
Increases the reproducibility and transparency of research by making the datasets and code openly available.
Limitations:
The datasets may suffer from issues such as variation in data quality and sampling bias.
The case study covers only blood glucose prediction, which limits how far its findings generalize to other diabetes management AI tasks.
Detailed information about the demographic diversity of the datasets (race, age, gender, etc.) may be lacking.
Further research with longer-term data tracking and analysis of the results may be needed.