Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

MHSNet: An MoE-based Hierarchical Semantic Representation Network for Accurate Duplicate Resume Detection with Large Language Model

Created by
  • Haebom

Authors

Yu Li, Zulong Chen, Wenjian Xu, Hong Wen, Yipeng Yu, Man Lung Yiu, Yuyu Yin

Outline

This paper proposes MHSNet, a framework for detecting duplicate resumes among those collected from third-party websites, with the goal of maintaining a company's talent pool. MHSNet fine-tunes BGE-M3 with contrastive learning and uses a Mixture-of-Experts (MoE) to generate multi-level (sparse and dense) representations of resumes, from which semantic similarity is computed. A notable feature is its state-aware MoE, which is designed to handle the many kinds of incomplete resumes. The authors report experimental results that support the effectiveness of MHSNet.
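The idea of scoring a resume pair from both dense and sparse representations, with weights that depend on how complete the resumes are, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the field names, the `gate_weights` rule, and the score formulas are all assumptions made for the example. In MHSNet the representations come from the fine-tuned BGE-M3 and the gating is learned, whereas here the gate is a fixed hand-set softmax.

```python
import math

def cosine(u, v):
    """Cosine similarity between two dense vectors (lists of floats)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def sparse_overlap(w1, w2):
    """Weighted term overlap between two term->weight dicts, in [0, 1]."""
    shared = set(w1) & set(w2)
    num = sum(min(w1[t], w2[t]) for t in shared)
    den = max(sum(w1.values()), sum(w2.values()))
    return num / den if den > 0 else 0.0

def gate_weights(completeness):
    """Hypothetical hand-set gate (NOT learned): softmax over two logits.

    Returns [dense_weight, sparse_weight]. The rule that more complete
    pairs lean on the dense score is an assumption for illustration only.
    """
    logits = [2.0 * completeness, 2.0 * (1.0 - completeness)]
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical resume fields used to describe the "state" of a resume.
FIELDS = ("name", "education", "experience")

def duplicate_score(r1, r2):
    """Blend dense and sparse similarity using the completeness-aware gate."""
    present = sum(1 for f in FIELDS if r1["fields"].get(f) and r2["fields"].get(f))
    completeness = present / len(FIELDS)
    w_dense, w_sparse = gate_weights(completeness)
    dense_sim = cosine(r1["dense"], r2["dense"])
    sparse_sim = sparse_overlap(r1["sparse"], r2["sparse"])
    return w_dense * dense_sim + w_sparse * sparse_sim

# Toy resumes with made-up vectors and term weights.
resume_a = {
    "fields": {"name": "A. Kim", "education": "BSc", "experience": "5y backend"},
    "dense": [0.6, 0.8],
    "sparse": {"python": 1.0, "backend": 0.5},
}
resume_b = {
    "fields": {"name": "A. Kim", "education": "", "experience": "5y backend"},
    "dense": [0.6, 0.8],
    "sparse": {"python": 1.0, "backend": 0.5},
}
print(duplicate_score(resume_a, resume_b))
```

A pair would be flagged as a duplicate when the blended score exceeds a threshold chosen on validation data; the threshold is likewise not specified in this summary.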

Takeaways, Limitations

Takeaways:
MHSNet can help improve the quality of resumes sourced from third parties and expand a company's talent pool.
The paper presents an effective duplicate detection method for incomplete and heterogeneous resume data.
The paper presents a novel approach to generating multi-level semantic representations by combining contrastive learning with MoE.
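The contrastive learning mentioned in the takeaways can be illustrated with an InfoNCE-style loss, a common choice for fine-tuning embedding models. This summary does not state which loss MHSNet uses, so the loss form, the `temperature` value, and the function name `info_nce` are assumptions for this sketch. The loss is small when the anchor resume is closer to its duplicate (the positive) than to unrelated resumes (the negatives).

```python
import math

def _cos(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu > 0 and nv > 0 else 0.0

def info_nce(anchor, positive, negatives, temperature=0.05):
    """InfoNCE loss for one anchor: -log softmax of the positive's similarity.

    anchor, positive: embedding vectors of two resumes known to be duplicates.
    negatives: embedding vectors of resumes known to be different people.
    temperature: assumed value; it scales how sharply similarities are compared.
    """
    sims = [_cos(anchor, positive)] + [_cos(anchor, n) for n in negatives]
    logits = [s / temperature for s in sims]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denom = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denom - logits[0]

# Toy embeddings: the positive matches the anchor, the negative is orthogonal.
anchor = [1.0, 0.0]
good = info_nce(anchor, positive=[1.0, 0.0], negatives=[[0.0, 1.0]])
bad = info_nce(anchor, positive=[0.0, 1.0], negatives=[[1.0, 0.0]])
print(good < bad)
```

During fine-tuning, a loss like this would be averaged over a batch and minimized by gradient descent on the encoder's parameters, which requires an autodiff framework rather than the plain Python shown here.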
Limitations:
The performance evaluation of the proposed MHSNet may be limited to a specific dataset. Additional experiments on diverse datasets are needed.
Further research is needed on applicability and scalability in real-world business environments.
Because MHSNet depends heavily on BGE-M3, how its performance changes with other base models still needs to be analyzed.