Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Towards Better Benchmark Datasets for Inductive Knowledge Graph Completion

Created by
  • Haebom

Authors

Harry Shomer, Jay Revolinsky, Jiliang Tang

Outline

This paper shows that existing benchmark datasets for inductive knowledge graph completion (KGC) contain a shortcut that lets a method score well even when it ignores relational information entirely. In particular, the authors find that ranking candidates by Personalized PageRank (PPR) score alone achieves the best performance, or close to it, on most datasets. They analyze the root cause of this problem and propose an alternative strategy for constructing inductive KGC datasets that alleviates the shortcut. Using the newly constructed dataset, they benchmark several popular methods and analyze their performance to better understand the capabilities and challenges of inductive KGC. The code and dataset are available on GitHub: https://github.com/HarryShomer/Better-Inductive-KGC
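To make the shortcut concrete, here is a minimal sketch of relation-agnostic PPR scoring. It is not the authors' implementation: the function names `personalized_pagerank` and `rank_tails`, the restart probability `alpha=0.15`, the fixed iteration count, and the choice to treat the graph as undirected are all assumptions made for illustration. The point is only that the relation in each triple is never consulted.

```python
from collections import defaultdict


def personalized_pagerank(triples, source, alpha=0.15, iters=50):
    """Approximate PPR scores from `source` by power iteration.

    `triples` is an iterable of (head, relation, tail). The relation is
    deliberately discarded and edges are treated as undirected (an
    assumption of this sketch).
    """
    neighbors = defaultdict(set)
    for head, _relation, tail in triples:
        neighbors[head].add(tail)
        neighbors[tail].add(head)
    if source not in neighbors:
        raise ValueError(f"{source!r} does not appear in the graph")

    nodes = list(neighbors)
    scores = {node: 0.0 for node in nodes}
    scores[source] = 1.0  # the random walk starts at the query entity

    for _ in range(iters):
        nxt = {node: 0.0 for node in nodes}
        for node in nodes:
            # With probability 1 - alpha, move to a uniformly chosen neighbor.
            share = (1.0 - alpha) * scores[node] / len(neighbors[node])
            for other in neighbors[node]:
                nxt[other] += share
        # With probability alpha, restart at the query entity.
        nxt[source] += alpha
        scores = nxt
    return scores


def rank_tails(triples, head, candidates, alpha=0.15):
    """Rank candidate tail entities for a query (head, ?, ?) by PPR alone."""
    scores = personalized_pagerank(triples, head, alpha=alpha)
    return sorted(candidates, key=lambda c: scores.get(c, 0.0), reverse=True)
```

Calling `rank_tails(triples, head, candidates)` returns the candidates ordered by proximity to the head entity, whatever relation the query asks about. A benchmark on which this ordering is already competitive is measuring graph proximity rather than relational reasoning.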

Takeaways, Limitations

Takeaways: The paper advances inductive KGC research by exposing a flaw in existing inductive KGC datasets and proposing a new dataset that alleviates the shortcut. The improved dataset allows a more accurate evaluation of how well inductive KGC models actually perform.
Limitations: There is no guarantee that the new dataset removes every shortcut, and new ones may exist. Further research is needed to determine whether the proposed dataset construction strategy applies to all inductive KGC problems. It is also worth discussing whether completely removing the PPR shortcut is always desirable, since the fact that PPR scores perform so well may itself suggest new approaches to inductive KGC.