
Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Hierarchical Reinforcement Learning for Temporal Abstraction of Listwise Recommendation

Created by
  • Haebom

Author

Luo Ji, Gao Liu, Mingyang Yin, Hongxia Yang, and Jingren Zhou

Outline

This paper argues that modern listwise recommender systems must account for both long-term user perception and short-term shifts in attention, and proposes mccHRL, a framework built on hierarchical reinforcement learning. mccHRL provides multiple levels of temporal abstraction for the listwise recommendation problem. The upper agent models how user perception evolves, while the lower agent produces the item selection policy by treating item selection as a sequential decision problem. The authors argue that this gives a clear decomposition into inter-session context (handled by the upper agent) and intra-session context (handled by the lower agent). Experiments in a simulator-based environment and on industrial datasets show improved performance over several baseline models. The data and code are open source.
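The two-level split described above can be illustrated with a toy sketch. This is not the authors' mccHRL implementation: the class names, the moving-average update, the greedy scoring rule, and the simulated click model are all assumptions made for illustration, and no learning is performed. It only shows the division of labor, with a hypothetical upper agent updated once per session and a hypothetical lower agent choosing items one at a time within a session.

```python
from dataclasses import dataclass, field


def dot(a, b):
    """Dot product of two equal-length feature vectors."""
    return sum(x * y for x, y in zip(a, b))


@dataclass
class UpperAgent:
    """Hypothetical upper agent: tracks user perception across sessions."""
    dim: int
    rate: float = 0.5  # assumed smoothing rate, not taken from the paper
    perception: list = field(default_factory=list)

    def __post_init__(self):
        if not self.perception:
            self.perception = [0.0] * self.dim

    def goal(self):
        # The goal handed to the lower agent is the current perception estimate.
        return list(self.perception)

    def update(self, feedback):
        # Inter-session step: exponential moving average of session feedback.
        self.perception = [
            (1 - self.rate) * p + self.rate * f
            for p, f in zip(self.perception, feedback)
        ]


@dataclass
class LowerAgent:
    """Hypothetical lower agent: builds one list item by item."""
    redundancy_penalty: float = 0.25  # assumed value

    def select_list(self, goal, candidates, list_size):
        chosen = []
        while len(chosen) < min(list_size, len(candidates)):
            def score(item_id):
                feats = candidates[item_id]
                # The state is the partial list: penalize overlap with it.
                overlap = max((dot(feats, candidates[c]) for c in chosen),
                              default=0.0)
                return dot(goal, feats) - self.redundancy_penalty * overlap

            remaining = [i for i in candidates if i not in chosen]
            chosen.append(max(remaining, key=score))
        return chosen


def run_session(upper, lower, candidates, true_pref, list_size=2):
    """One session: the lower agent fills the list, then the upper agent updates."""
    shown = lower.select_list(upper.goal(), candidates, list_size)
    # Simulated user (an assumption): clicks items aligned with true_pref.
    clicked = [i for i in shown if dot(true_pref, candidates[i]) > 0]
    dim = len(true_pref)
    if clicked:
        feedback = [sum(candidates[i][d] for i in clicked) / len(clicked)
                    for d in range(dim)]
    else:
        feedback = [0.0] * dim
    upper.update(feedback)
    return shown, clicked


if __name__ == "__main__":
    items = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.9, 0.1]}
    upper, lower = UpperAgent(dim=2), LowerAgent()
    for session in range(2):
        print(session, run_session(upper, lower, items, true_pref=[1.0, -1.0]))
```

In this toy setup the first session shows a diverse list because the upper agent has no information yet, and the second session shifts toward items resembling what was clicked. In the framework the summary describes, both levels would be trained policies rather than the fixed rules used here.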

Takeaways, Limitations

Takeaways:
Presents a new listwise recommendation framework that models both long-term and short-term user preferences with hierarchical reinforcement learning.
Improves modeling efficiency by clearly separating inter-session and intra-session contexts.
Reports improved performance over existing methods in experiments.
Supports reproducibility and follow-up research by releasing the data and code.
Limitations:
Further research is needed to investigate the generality of the proposed framework and its applicability to various recommender systems.
Further analysis is needed to understand the performance differences due to differences between simulator-based and real-world environments.
Further validation is needed on the generalizability of modeling to more complex and diverse user behavior patterns.