Daily Arxiv

This page curates AI-related papers published around the world.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Combating Confirmation Bias: A Unified Pseudo-Labeling Framework for Entity Alignment

Created by
  • Haebom

Authors

Qijie Ding, Jie Yin, Daokun Zhang, Junbin Gao

Outline

This paper addresses entity alignment (EA), the task of identifying entities in different knowledge graphs that refer to the same real-world entity. To cope with the scarcity of seed alignments for training, existing EA models adopt a pseudo-labeling strategy that iteratively adds confidently predicted entity pairs to the seed alignment set. However, this process overlooks the negative impact of confirmation bias, which accumulates pseudo-labeling errors and degrades performance.

The paper proposes UPL-EA, a unified pseudo-labeling framework that systematically combats confirmation bias in pseudo-labeling-based EA by explicitly eliminating pseudo-labeling errors. UPL-EA consists of two components: (1) Optimal Transport (OT)-based pseudo-labeling, which uses discrete OT modeling to determine entity correspondences across the two knowledge graphs and reduce false matches, together with an effective criterion for inferring pseudo-labeled alignments that satisfy one-to-one correspondence; and (2) a parallel pseudo-labeling ensemble, which refines the pseudo-labeled alignments by combining the predictions of multiple models trained independently in parallel. The refined pseudo-labeled alignments are then added to the seed alignments to strengthen model training for subsequent alignment inference.

The effectiveness of UPL-EA in eliminating pseudo-labeling errors is verified both theoretically and experimentally. Extensive experiments and in-depth analysis show that UPL-EA outperforms 15 competing baselines and is useful as a general pseudo-labeling framework.
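To make the two components more concrete, below is a minimal, self-contained sketch rather than the authors' implementation: it approximates the discrete OT step with entropic OT solved by Sinkhorn iterations, uses a mutual-best-match rule as a stand-in for the paper's one-to-one criterion, and mimics the parallel ensemble by intersecting the pseudo-labels produced by several independently seeded toy models. All embeddings, function names, and parameter values here are illustrative assumptions.

```python
import numpy as np


def sinkhorn(cost, reg=0.1, n_iters=200):
    """Entropic OT via Sinkhorn iterations (a stand-in for the paper's discrete OT).

    Returns an approximately doubly stochastic transport plan between the
    entities of the two knowledge graphs.
    """
    n, m = cost.shape
    K = np.exp(-cost / reg)          # Gibbs kernel
    a = np.full(n, 1.0 / n)          # uniform marginal over KG1 entities
    b = np.full(m, 1.0 / m)          # uniform marginal over KG2 entities
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)
        v = b / (K.T @ u)
    return u[:, None] * K * v[None, :]


def one_to_one_pairs(plan):
    """A simple one-to-one criterion: keep only pairs that are mutual best matches."""
    row_best = plan.argmax(axis=1)
    col_best = plan.argmax(axis=0)
    return {(i, int(j)) for i, j in enumerate(row_best) if col_best[j] == i}


# Toy setup: each "model" produces its own embeddings for the two KGs.
n_entities, dim, n_models = 50, 16, 3
pair_sets = []
for seed in range(n_models):                        # independently trained models
    rng = np.random.default_rng(seed)
    emb1 = rng.normal(size=(n_entities, dim))
    emb2 = emb1 + 0.1 * rng.normal(size=(n_entities, dim))   # noisy counterpart KG
    e1 = emb1 / np.linalg.norm(emb1, axis=1, keepdims=True)
    e2 = emb2 / np.linalg.norm(emb2, axis=1, keepdims=True)
    cost = 1.0 - e1 @ e2.T                           # cosine distance as OT cost
    plan = sinkhorn(cost)
    pair_sets.append(one_to_one_pairs(plan))

# Parallel pseudo-labeling ensemble: keep only pairs every model agrees on,
# then add them to the seed alignments for the next training round.
pseudo_labels = set.intersection(*pair_sets)
print(f"Retained {len(pseudo_labels)} pseudo-labelled entity pairs")
```

In this toy example the intersection across models discards pairs that only one model predicts, which is the intuition behind using a parallel ensemble to filter out pseudo-labeling errors before they feed back into training.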

Takeaways, Limitations

Takeaways:
Presents UPL-EA, a novel framework that effectively addresses the confirmation bias problem in pseudo-labeling-based entity alignment.
Improves alignment accuracy through OT-based pseudo-labeling and a parallel pseudo-labeling ensemble.
Verifies the superiority and generality of UPL-EA through extensive experiments.
Presents a criterion for inferring pseudo-labeled alignments that satisfy one-to-one correspondence.
Limitations:
The performance improvement of UPL-EA may be limited to specific datasets and models; additional experiments on knowledge graphs with diverse characteristics are needed.
OT-based pseudo-labeling can be computationally expensive; research on improving its computational efficiency is needed.
Error types other than confirmation bias (e.g., noisy data) may not be sufficiently considered; models that are robust to various types of errors need to be developed.