Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

PhenoGnet: A Graph-Based Contrastive Learning Framework for Disease Similarity Prediction

Created by
  • Haebom

Authors

Ranga Baminiwatte, Kazi Jewel Rana, Aaron J. Masino

Outline

PhenoGnet is a graph-based contrastive learning framework that predicts disease similarity by integrating gene functional interaction networks with the Human Phenotype Ontology (HPO). It has two components: an intra-view model that encodes the gene and phenotype graphs separately using Graph Convolutional Networks (GCNs) and Graph Attention Networks (GATs), and a cross-view model, implemented as a shared-weight multilayer perceptron (MLP), that aligns gene and phenotype embeddings through contrastive learning. Training uses known gene-phenotype associations as positive pairs and randomly sampled unrelated pairs as negative pairs. Each disease is represented by the mean embedding of its associated genes and/or phenotypes, and pairwise disease similarity is computed as the cosine similarity between these representations. On a reference dataset of 1,100 similar and 866 dissimilar disease pairs, the gene-based embeddings achieved an AUCPR of 0.9012 and an AUROC of 0.8764, outperforming existing state-of-the-art methods. By capturing latent biological relationships beyond direct overlap, PhenoGnet offers a scalable and interpretable approach to disease similarity prediction, with potential downstream applications in rare disease research and precision medicine.
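The scoring and evaluation steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the gene embeddings, gene-to-disease mappings, and labeled pairs below are made-up toy values, and the function names (`mean_embedding`, `cosine_similarity`, `auroc`) are hypothetical. In the paper the embeddings would come from the trained GCN/GAT encoders; here they are hard-coded so that the example runs with only the Python standard library.

```python
import math


def mean_embedding(vectors):
    """Average a non-empty list of equal-length vectors element-wise."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def auroc(scores, labels):
    """AUROC as the probability that a positive outscores a negative.

    Ties count as half a win. Assumes at least one positive (label 1)
    and one negative (label 0) example.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))


# Toy gene embeddings (assumed values, standing in for encoder output).
gene_emb = {
    "G1": [1.0, 0.0],
    "G2": [0.0, 1.0],
    "G3": [1.0, 1.0],
    "G4": [-1.0, 0.0],
}

# Toy disease-to-gene associations (assumed, for illustration only).
disease_genes = {"D1": ["G1", "G2"], "D2": ["G3"], "D3": ["G4"]}

# Each disease is the mean of its associated gene embeddings.
disease_emb = {
    disease: mean_embedding([gene_emb[g] for g in genes])
    for disease, genes in disease_genes.items()
}

# Labeled disease pairs: 1 = similar, 0 = dissimilar (assumed labels).
pairs = [("D1", "D2", 1), ("D1", "D3", 0), ("D2", "D3", 0)]
scores = [cosine_similarity(disease_emb[a], disease_emb[b]) for a, b, _ in pairs]
labels = [y for _, _, y in pairs]

print("pair scores:", scores)
print("AUROC:", auroc(scores, labels))
```

The pairwise AUROC used here is quadratic in the number of pairs, which is fine for a toy example; for a benchmark of roughly 2,000 labeled pairs, a library routine such as scikit-learn's `roc_auc_score` would be the more usual choice.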

Takeaways, Limitations

Takeaways:
Integrating the gene functional interaction network with HPO improves the accuracy of disease similarity prediction (AUCPR 0.9012, AUROC 0.8764).
The model captures latent biological relationships beyond direct overlap.
It has potential applications in rare disease research and precision medicine.
The approach is scalable and interpretable.
Limitations:
The paper does not explicitly discuss limitations. Additional experiments and analyses may be needed to assess the model's generalization performance, its performance on specific disease types, and the interpretability of its embeddings. The size and potential bias of the dataset should also be taken into account.