Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Improving the Robustness of Distantly-Supervised Named Entity Recognition via Uncertainty-Aware Teacher Learning and Student-Student Collaborative Learning

Created by
  • Haebom

Authors

Shuzheng Si, Helan Hu, Haozhe Zhao, Shuang Zeng, Kaikai An, Zefan Cai, Baobao Chang

Outline

Distantly-supervised named entity recognition (DS-NER) is widely used in real-world scenarios, but it suffers from label noise. Existing methods based on the teacher-student framework are limited because an unreliable teacher network generates incorrect pseudo-labeled samples, which leads to error propagation. To address these issues, we propose (1) uncertainty-aware teacher learning, which uses prediction uncertainty to reduce the number of incorrect pseudo-labels, and (2) student-student collaborative learning, which passes reliable labels between two student networks to reduce the reliance on the teacher's pseudo-labels and to fully explore the mislabeled samples. The proposed method outperforms state-of-the-art DS-NER methods on five DS-NER datasets.
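The two ideas above can be illustrated with a minimal sketch. The summary does not say how uncertainty is estimated or how the students decide which labels are reliable, so the choices here are assumptions made only for illustration: uncertainty is taken to be the entropy of the teacher's per-token label distribution, and a student's label counts as reliable when its top probability reaches a threshold. The function names and thresholds are hypothetical and are not taken from the paper.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one token's label distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def argmax(probs):
    """Index of the largest probability."""
    return max(range(len(probs)), key=probs.__getitem__)


def teacher_pseudo_labels(token_probs, max_entropy):
    """Keep the teacher's label only where its uncertainty is low.

    Assumption: uncertainty is measured by entropy. Returns one entry
    per token: a label index, or None when the prediction is too
    uncertain to be used as a pseudo-label.
    """
    return [
        argmax(probs) if entropy(probs) <= max_entropy else None
        for probs in token_probs
    ]


def exchange_reliable_labels(probs_a, probs_b, min_confidence):
    """Let two students propose training targets for each other.

    Assumption: a label is "reliable" when the proposing student's top
    probability is at least min_confidence. Returns
    (targets_for_a, targets_for_b); None marks tokens with no
    reliable proposal.
    """
    def confident(token_probs):
        out = []
        for probs in token_probs:
            best = argmax(probs)
            out.append(best if probs[best] >= min_confidence else None)
        return out

    # Student A learns from B's confident labels, and vice versa.
    return confident(probs_b), confident(probs_a)


if __name__ == "__main__":
    teacher = [[0.9, 0.05, 0.05], [0.4, 0.3, 0.3]]
    print(teacher_pseudo_labels(teacher, max_entropy=0.5))

    student_a = [[0.8, 0.2], [0.55, 0.45]]
    student_b = [[0.3, 0.7], [0.1, 0.9]]
    print(exchange_reliable_labels(student_a, student_b, min_confidence=0.75))
```

In this sketch, tokens marked None would simply be left out of the corresponding loss term. The actual method may weight or select samples differently.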

Takeaways, Limitations

Takeaways: We propose a novel method that addresses the label noise problem in distantly-supervised named entity recognition (DS-NER) and achieves state-of-the-art performance. Uncertainty-aware teacher learning and student-student collaborative learning alleviate the reliability problem of the teacher network and generate more accurate pseudo-labels, leading to improved performance.
Limitations: The effectiveness of the proposed method may vary depending on the dataset used. Additional experiments on different types of datasets are needed, and the computational cost may increase. Additional research on the generalization performance for specific domains or languages is needed.