Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Improved Personalized Headline Generation via Denoising Fake Interests from Implicit Feedback

Created by
  • Haebom

Authors

Kejin Liu, Junhong Lian, Xiang Ao, Ningtao Wang, Xing Fu, Yu Cheng, Weiqiang Wang, Xinyu Liu

Outline

This paper points out that existing methods for generating personalized headlines from users' past click data do not account for irrelevant click noise in the clickstream, and can therefore generate headlines that do not match users' actual preferences. To address this, the authors propose PHG-DIF (Personalized Headline Generation framework via Denoising Fake Interests from Implicit Feedback). PHG-DIF first removes clickstream noise through dual-stage filtering based on short dwell times and unusual click bursts. It then models users' evolving, multifaceted interests through multi-level temporal fusion to build a more accurate user profile. The paper also presents a new benchmark dataset, DT-PENS, consisting of click data from 1,000 users and approximately 10,000 annotated personalized headlines. Experimental results show that PHG-DIF significantly mitigates the negative impact of click noise and achieves state-of-the-art performance.
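
The filtering idea described above (dropping clicks with short dwell times and clicks that belong to unusual bursts) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Click` record, the function names, and the thresholds (`min_dwell`, `window`, `max_clicks`) are all hypothetical choices made for illustration, since this summary does not state the paper's actual criteria.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass
from typing import List


@dataclass
class Click:
    news_id: str
    timestamp: float   # seconds since some reference point
    dwell_time: float  # seconds spent reading the article


def filter_click_bursts(clicks: List[Click],
                        window: float = 60.0,
                        max_clicks: int = 3) -> List[Click]:
    """Drop clicks that have more than `max_clicks` clicks (itself included)
    within +/- `window` seconds. Thresholds are illustrative assumptions."""
    ordered = sorted(clicks, key=lambda c: c.timestamp)
    times = [c.timestamp for c in ordered]
    kept = []
    for c in ordered:
        lo = bisect_left(times, c.timestamp - window)
        hi = bisect_right(times, c.timestamp + window)
        if hi - lo <= max_clicks:
            kept.append(c)
    return kept


def filter_short_dwell(clicks: List[Click],
                       min_dwell: float = 10.0) -> List[Click]:
    """Drop clicks whose dwell time is below `min_dwell` seconds
    (an assumed threshold, not taken from the paper)."""
    return [c for c in clicks if c.dwell_time >= min_dwell]


def denoise_clicks(clicks: List[Click]) -> List[Click]:
    """Apply both filters: burst detection on the raw stream first,
    then the dwell-time filter on what remains."""
    return filter_short_dwell(filter_click_bursts(clicks))


if __name__ == "__main__":
    history = [
        Click("a", 0, 30),      # ordinary click
        Click("b", 1000, 2),    # very short dwell time
        Click("c", 2000, 20),   # c, d, e, f form a burst of four clicks
        Click("d", 2005, 20),
        Click("e", 2010, 20),
        Click("f", 2015, 20),
        Click("g", 5000, 45),   # ordinary click
    ]
    print([c.news_id for c in denoise_clicks(history)])  # prints ['a', 'g']
```

In this sketch the burst filter runs on the raw clickstream before the dwell-time filter, so that short-dwell clicks still count toward detecting a burst. Whether PHG-DIF orders or combines its two filters this way is not stated in the summary.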

Takeaways, Limitations

Takeaways:
The paper identifies the negative impact of clickstream noise on personalized headline generation and proposes a way to mitigate it.
It proposes PHG-DIF, a novel framework that models users' multifaceted and evolving interests.
It releases DT-PENS, a new benchmark dataset for personalized headline generation research.
In the reported experiments, PHG-DIF outperforms existing methods and achieves state-of-the-art performance.
Limitations:
The DT-PENS dataset may be relatively small (1,000 users and approximately 10,000 headlines).
Further research may be needed to determine how general and widely applicable the click noise removal criteria (short dwell times, unusual click bursts) are.
Additional validation may be required to confirm that the approach generalizes to other types of click noise.