Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

GMLM: Bridging Graph Neural Networks and Language Models for Heterophilic Node Classification

Created by
  • Haebom

Author

Aarush Sinha

Outline

This paper presents an efficient and effective way to integrate computationally expensive pre-trained language models (PLMs) with graph neural networks (GNNs) on text-rich heterophilic graphs. The proposed framework, Graph Masked Language Model (GMLM), has two stages: a contrastive pre-training stage that uses a soft masking technique, and an end-to-end fine-tuning stage that uses a dynamic active node selection strategy and a bidirectional cross-attention module. Experiments on five heterophilic benchmarks show that GMLM achieves state-of-the-art performance on four of them and significantly outperforms existing GNN methods and methods based on much larger LLMs. For example, it improves accuracy by more than 8% on the Texas dataset and by nearly 5% on the Wisconsin dataset. The study demonstrates that a sophisticated, deeply integrated architecture can be more effective and efficient than larger, more general-purpose models for learning text-rich graph representations.
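The two training-side ideas named above, soft masking and dynamic active node selection, can be illustrated with a minimal sketch. The summary does not give their exact formulation, so everything here is an assumption made for illustration: soft masking is taken to mean blending a node's feature vector with a mask token instead of replacing it outright, and active node selection is taken to mean sampling a small subset of nodes per training step whose text would be sent through the PLM. The function names `soft_mask` and `select_active_nodes` and the parameters `mask_rate`, `beta`, and `budget` are hypothetical and do not come from the paper.

```python
import random


def soft_mask(features, mask_token, mask_rate=0.3, beta=0.7, rng=None):
    """Blend a random subset of node feature vectors with a mask token.

    Assumed formulation (not taken from the paper): a masked node keeps
    (1 - beta) of its own features and receives beta of the mask token.
    Returns the new feature list and the set of masked node ids.
    """
    if rng is None:
        rng = random.Random(0)
    num_nodes = len(features)
    num_masked = max(1, int(mask_rate * num_nodes))
    masked_ids = set(rng.sample(range(num_nodes), num_masked))

    blended = []
    for node_id, vector in enumerate(features):
        if node_id in masked_ids:
            # Soft mask: interpolate instead of overwriting the vector.
            blended.append(
                [(1 - beta) * x + beta * m for x, m in zip(vector, mask_token)]
            )
        else:
            blended.append(list(vector))
    return blended, masked_ids


def select_active_nodes(num_nodes, budget, rng=None):
    """Pick the nodes whose text would be encoded by the PLM this step.

    Assumed strategy (not taken from the paper): uniform random sampling
    of at most `budget` node ids, redrawn on every call.
    """
    if rng is None:
        rng = random.Random(0)
    return sorted(rng.sample(range(num_nodes), min(budget, num_nodes)))


if __name__ == "__main__":
    feats = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
    blended, masked = soft_mask(feats, mask_token=[0.0, 0.0], mask_rate=0.5, beta=0.5)
    print("masked node ids:", sorted(masked))
    print("blended features:", blended)
    print("active nodes:", select_active_nodes(num_nodes=10, budget=3))
```

Under these assumptions, the cost of the language model per step scales with `budget` rather than with the number of nodes in the graph, which is one plausible reading of the efficiency claim in the summary.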

Takeaways, Limitations

Takeaways:
Presents an efficient and effective method for integrating PLMs and GNNs on text-rich heterophilic graphs.
Improves scalability and performance through soft masking and dynamic active node selection.
Achieves state-of-the-art performance on four of five heterophilic graph benchmarks.
Demonstrates the advantage of a deeply integrated architecture over larger, more general-purpose models.
Limitations:
Further research is needed on the generalization performance of the proposed method.
Robustness assessment for various graph structures and text types is needed.
A more detailed analysis of computational costs is needed.