Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Towards Text-free Graph Foundation Models: Rethinking Multi-Domain Graph Contrastive Learning

Created by
  • Haebom

Authors

Zihao Zhao, Xinlong Zhai, Jinyu Yang, Chuan Shi

Outline

In this paper, we propose MDGCL, a novel multi-domain pre-training and cross-domain transfer framework that builds graph foundation models by leveraging graph data from various domains. To overcome the limitations of conventional contrastive learning strategies, which are designed around a single domain, we introduce a contrastive learning strategy that recognizes and captures differences between domains, together with domain tokens that encode domain-level global information. In downstream tasks, a domain attention mechanism enables fine-grained transfer of domain knowledge. Experimental results on five benchmark datasets show that the proposed method outperforms state-of-the-art techniques by up to 19.33% in accuracy and up to 19.13% in Macro-F1 score.
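
The summary above does not give the paper's actual loss, so the following is only a minimal sketch of how a contrastive objective could be made domain-aware. It assumes a standard InfoNCE loss over two augmented views and assumes each domain token is simply added to the embeddings of nodes from that domain; the function names, the additive token, and the temperature value are all hypothetical and are not taken from MDGCL.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def domain_aware_info_nce(view_a, view_b, domain_ids, domain_tokens, temperature=0.5):
    """InfoNCE loss over two views, with a per-domain token added to each embedding.

    view_a, view_b : (n, d) embeddings of the same n nodes under two augmentations
    domain_ids     : (n,) integer domain index of each node
    domain_tokens  : (num_domains, d) one vector per domain (assumed parameterisation)
    """
    # Assumption: domain-level information is injected by adding the domain token.
    za = l2_normalize(view_a + domain_tokens[domain_ids])
    zb = l2_normalize(view_b + domain_tokens[domain_ids])

    # Row i holds the similarity of node i (view a) to every node (view b).
    logits = za @ zb.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability

    # The positive pair is the diagonal; all other nodes, from any domain, are negatives.
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    nodes = rng.normal(size=(8, 64))
    ids = np.array([0, 0, 0, 0, 1, 1, 1, 1])   # two toy domains
    tokens = rng.normal(size=(2, 64))
    noisy_view = nodes + 0.1 * rng.normal(size=nodes.shape)
    print(domain_aware_info_nce(nodes, noisy_view, ids, tokens))
```

In this sketch, nodes from the same domain share a token and therefore start out closer to each other than to nodes from other domains, which is one simple way for a loss to reflect domain differences. The paper's own strategy may differ substantially.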

Takeaways, Limitations

Takeaways:
We present a new framework that overcomes the limitations of existing single-domain-centered training methods for graph foundation models and effectively utilizes multi-domain graph data.
We propose a contrastive learning strategy that accounts for differences between domains and a domain attention mechanism that transfers knowledge from multiple domains to downstream tasks.
The method achieves state-of-the-art performance on five benchmark datasets, improving on prior techniques by up to 19.33% in accuracy and up to 19.13% in Macro-F1 score.
Limitations:
The effectiveness of the proposed method has been shown only on specific benchmark datasets; generalization to other types of graph data or tasks requires further study.
The domain tokens and the domain attention mechanism each follow a single design, and there is no comparative analysis of alternative designs.
There may still be challenges in obtaining and preprocessing graph data from various domains.
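
As an illustration of what a domain attention mechanism could look like, here is a minimal sketch in which each downstream node embedding attends over the pre-trained domain tokens with scaled dot-product attention. This is one possible design under stated assumptions, not the paper's implementation: the function names are hypothetical, learned query/key/value projections are omitted, and the residual addition at the end is an assumption.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def domain_attention(node_emb, domain_tokens):
    """Mix domain tokens into node embeddings using per-node attention weights.

    node_emb      : (n, d) embeddings of downstream nodes (used as queries)
    domain_tokens : (k, d) one token per source domain (used as keys and values)
    Returns (enhanced embeddings of shape (n, d), attention weights of shape (n, k)).
    """
    d = node_emb.shape[1]
    scores = node_emb @ domain_tokens.T / np.sqrt(d)  # (n, k) node-to-domain affinity
    weights = softmax(scores, axis=1)                 # each row sums to 1
    transferred = weights @ domain_tokens             # (n, d) weighted mix of tokens
    return node_emb + transferred, weights            # assumed residual combination

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tokens = rng.normal(size=(3, 16))   # three toy source domains
    nodes = rng.normal(size=(5, 16))
    enhanced, weights = domain_attention(nodes, tokens)
    print(weights.round(3))
```

Because the weights are computed per node, two nodes in the same downstream graph can draw on different source domains, which is one reading of "fine-grained" transfer.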