Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

RapidGNN: Energy and Communication-Efficient Distributed Training on Large-Scale Graph Neural Networks

Created by
  • Haebom

Authors

Arefin Niam, Tevfik Kosar, MSQ Zulkar Nine

Outline

This paper proposes RapidGNN, a novel framework for improving the efficiency of distributed training of Graph Neural Networks (GNNs) on large-scale graphs. Existing sampling-based approaches reduce the computational load, but communication overhead remains a bottleneck. RapidGNN uses deterministic sampling-based scheduling to enable efficient cache construction and prefetching of remote features. On benchmark graph datasets, RapidGNN improves end-to-end training throughput by 2.46x to 3.00x on average over existing methods and reduces remote feature fetches by 9.70x to 15.39x. It also achieves near-linear scalability as the number of compute units grows, and improves energy efficiency over existing methods by 44% on CPUs and 32% on GPUs.
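A minimal sketch of the core idea, deterministic sampling-based scheduling: if the neighborhood sampler is seeded, the nodes each mini-batch will touch can be computed before training reaches that batch, so remote features can be prefetched and cached in advance. The function and variable names below (precompute_schedule, prefetch_plan, local_nodes, etc.) are illustrative assumptions, not the authors' API.

```python
# Minimal sketch (assumed names, not the authors' code): with a fixed seed,
# neighborhood sampling is deterministic, so the remote features each
# mini-batch will need can be computed -- and prefetched/cached -- ahead of time.
import numpy as np

def precompute_schedule(adj_lists, batches, fanout, seed):
    """Deterministically sample each batch's neighborhood up front."""
    rng = np.random.default_rng(seed)  # same seed -> same schedule every time
    schedule = []
    for batch in batches:
        sampled = set(batch)
        for node in batch:
            nbrs = adj_lists[node]
            k = min(fanout, len(nbrs))
            sampled.update(int(v) for v in rng.choice(nbrs, size=k, replace=False))
        schedule.append(sampled)
    return schedule

def prefetch_plan(schedule, local_nodes):
    """Remote node IDs whose features each batch must fetch (or hit in cache)."""
    return [sorted(s - local_nodes) for s in schedule]

# Usage: build the plan once per epoch, then overlap feature fetches with computation.
adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
schedule = precompute_schedule(adj, batches=[[0], [1]], fanout=2, seed=42)
print(prefetch_plan(schedule, local_nodes={0, 1}))  # e.g. [[2, 3], [2]]
```

Because the schedule is reproducible from the seed alone, each worker can derive the same access pattern independently and fetch remote features before they are needed, which is where the reported reduction in remote fetches comes from.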

Takeaways, Limitations

Takeaways:
Presents RapidGNN, a novel framework that significantly improves the efficiency of distributed GNN training on large-scale graphs.
Achieves substantially higher throughput and energy efficiency than existing methods.
Exhibits near-linear scalability as compute units are added.
Empirically demonstrates the effectiveness of deterministic sampling-based scheduling.
Limitations:
The types and sizes of the benchmark datasets used are not described in detail.
RapidGNN's performance gains may depend on the specific hardware environment.
Further comparative analysis against other distributed training frameworks is needed.
Further research is needed on applicability to, and generalization across, a wider range of GNN models.