Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Self-supervised learning on gene expression data

Created by
  • Haebom

Authors

Kevin Dradjat, Massinissa Hamidi, Pierre Bartet, Blaise Hanczar

Outline

This paper studies self-supervised learning for predicting phenotypes from gene expression data. Supervised machine learning and deep learning methods need large amounts of labeled data, which are costly and time-consuming to obtain for gene expression. To address this limitation, the authors select three state-of-the-art self-supervised learning methods, apply them to bulk gene expression data, and evaluate whether they improve phenotype prediction accuracy. Experiments on several public gene expression datasets show that the self-supervised methods capture complex information in the data and improve prediction accuracy. The paper also analyzes the strengths and limitations of each method, recommends which method to use for which application, and suggests future research directions. The authors describe it as the first study to combine bulk RNA-Seq data with self-supervised learning.
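The two-stage workflow described above (pretrain on unlabeled expression profiles, then predict a phenotype from the learned representation using few labels) can be sketched as follows. This is only an illustrative sketch: the summary does not name the three methods the paper evaluates, so masked-gene reconstruction with a linear autoencoder is used here as a stand-in pretext task. The data are synthetic, and every function name, parameter, and hyperparameter below is hypothetical.

```python
import numpy as np


def make_synthetic_expression(n_samples=200, n_genes=50, n_factors=5, seed=0):
    """Synthetic stand-in for bulk expression data (assumption, not real data)."""
    rng = np.random.default_rng(seed)
    factors = rng.normal(size=(n_samples, n_factors))
    loadings = rng.normal(size=(n_factors, n_genes))
    x = factors @ loadings + 0.1 * rng.normal(size=(n_samples, n_genes))
    x = (x - x.mean(axis=0)) / x.std(axis=0)  # standardize each gene
    y = (factors[:, 0] > 0).astype(int)       # hypothetical binary phenotype
    return x, y


def pretrain_masked_autoencoder(x, n_hidden=8, mask_rate=0.3, lr=0.2,
                                n_epochs=500, seed=0):
    """Label-free pretext task: reconstruct randomly masked genes."""
    rng = np.random.default_rng(seed)
    n_genes = x.shape[1]
    w_enc = rng.normal(scale=0.1, size=(n_genes, n_hidden))
    w_dec = rng.normal(scale=0.1, size=(n_hidden, n_genes))
    losses = []
    for _ in range(n_epochs):
        mask = rng.random(x.shape) < mask_rate  # True = hidden from the encoder
        x_in = np.where(mask, 0.0, x)
        h = x_in @ w_enc
        x_hat = h @ w_dec
        err = (x_hat - x) * mask                # score only the masked entries
        n_masked = mask.sum()
        losses.append(float((err ** 2).sum() / n_masked))
        grad_out = 2.0 * err / n_masked
        grad_dec = h.T @ grad_out
        grad_enc = x_in.T @ (grad_out @ w_dec.T)
        w_enc -= lr * grad_enc                  # full-batch gradient descent
        w_dec -= lr * grad_dec
    return w_enc, losses


def nearest_centroid_predict(z_train, y_train, z_test):
    """Minimal downstream classifier fitted on a small labeled subset."""
    classes = np.unique(y_train)
    centroids = np.stack([z_train[y_train == c].mean(axis=0) for c in classes])
    dists = ((z_test[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[dists.argmin(axis=1)]


if __name__ == "__main__":
    x, y = make_synthetic_expression()
    w_enc, losses = pretrain_masked_autoencoder(x)  # no labels used here
    z = x @ w_enc                                   # learned representation
    # Only the first 20 samples are treated as labeled.
    y_pred = nearest_centroid_predict(z[:20], y[:20], z[20:])
    print("pretext loss: first", losses[0], "last", losses[-1])
    print("accuracy on unlabeled samples:", (y_pred == y[20:]).mean())
```

Labels enter only in the last step, which mirrors the motivation in the summary: the encoder is fitted on all profiles without phenotype annotations, and the scarce labels are spent on a small downstream classifier.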

Takeaways, Limitations

Takeaways:
  • Self-supervised learning can reduce dependence on labeled data and improve the accuracy of phenotype prediction from gene expression data.
  • The authors present this as the first study to apply self-supervised learning methods to bulk RNA-Seq data analysis, which opens new research directions.
  • A comparative analysis of the three self-supervised learning methods lays out the strengths and limitations of each and gives guidance on choosing a method for a given use case.
Limitations:
  • Only three self-supervised learning methods were evaluated.
  • Generalization to a wider range of gene expression datasets still needs to be evaluated.
  • The interpretability of self-supervised learning models needs further research.