Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

SPARC: Soft Probabilistic Adaptive multi-interest Retrieval Model via Codebooks for recommender system

Created by
  • Haebom

Authors

Jialiang Shi, Yaguang Dou, Tian Qi

Outline

This paper proposes SPARC (Soft Probabilistic Adaptive Retrieval Model via Codebooks), a novel retrieval framework that addresses the core challenges of multi-interest modeling in real-world recommender systems (RS). To overcome the fixed interests and passive matching strategies of existing methods, the authors use a Residual Quantized Variational Autoencoder (RQ-VAE) to construct a discrete interest space that evolves dynamically with user behavior, and introduce a probabilistic interest module that predicts a probability distribution over the entire interest space. This shifts online inference from "passive matching" to "active exploration," which enhances the discovery of new interests. In A/B testing on an industry platform with tens of millions of daily active users, SPARC increased user watch time by 0.9%, page views by 0.4%, and PV500 (new content reaching 500 PV within 24 hours) by 22.7%. Offline evaluation on the Amazon Product dataset also showed improvements in Recall@K and NDCG@K.
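To make the codebook idea concrete, here is a minimal sketch of residual quantization, the mechanism behind RQ-VAE, followed by a soft distribution over one codebook. It is an illustration only, not the authors' implementation: the function names, the toy codebooks, and the distance-based softmax with a `temperature` parameter are all assumptions made for this example. In the paper the codebooks are learned and the interest probabilities come from a trained module.

```python
import numpy as np

def residual_quantize(x, codebooks):
    """Quantize x with a stack of codebooks, one per level (hypothetical helper).

    Each level picks the code nearest to the current residual and subtracts it,
    so later levels refine what earlier levels missed.
    """
    residual = np.asarray(x, dtype=float)
    reconstruction = np.zeros_like(residual)
    codes = []
    for codebook in codebooks:
        # Squared Euclidean distance from the residual to every code.
        distances = np.sum((codebook - residual) ** 2, axis=1)
        index = int(np.argmin(distances))
        codes.append(index)
        reconstruction += codebook[index]
        residual = residual - codebook[index]
    return codes, reconstruction

def soft_interest_distribution(user_vector, codebook, temperature=1.0):
    """Assumed stand-in for a probabilistic interest module:
    a softmax over negative squared distances to each code."""
    distances = np.sum((codebook - np.asarray(user_vector, dtype=float)) ** 2, axis=1)
    logits = -distances / temperature
    logits -= logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()

# Toy codebooks (made up for illustration): two levels, two codes each.
codebooks = [
    np.array([[0.0, 0.0], [4.0, 4.0]]),
    np.array([[0.0, 0.0], [1.0, -1.0]]),
]
codes, reconstruction = residual_quantize([5.0, 3.0], codebooks)
probabilities = soft_interest_distribution([5.0, 3.0], codebooks[0])
```

With these toy values, the first level selects the code [4, 4], the second level selects [1, -1] for the remaining residual, and the two codes sum back to the input. The soft distribution assigns every code a nonzero probability, which is the property that lets a retrieval system sample interests beyond the single nearest match.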

Takeaways, Limitations

Takeaways:
Demonstrates that multi-interest modeling can adapt dynamically to user behavior.
Promotes the discovery of new interests by shifting from traditional passive matching to active exploration.
Validates effectiveness and practicality through experiments on an industrial platform and an open-source dataset.
Shows through real business metrics that the approach improves recommender system performance.
Limitations:
Constructing and training the interest space with RQ-VAE is complex.
Because these results are from a specific industry platform and dataset, further research is needed to determine their generalizability to other domains or datasets.
There is room for further analysis and improvement of the performance of the probabilistic interest module.
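
For readers unfamiliar with the offline metrics named in the outline, the sketch below computes Recall@K and NDCG@K for a single user. It assumes binary relevance and a ranked list without duplicates; the paper's exact evaluation protocol is not given in this summary, and the function names and sample data are hypothetical.

```python
import math

def recall_at_k(ranked_items, relevant_items, k):
    """Fraction of the relevant items that appear in the top k."""
    if not relevant_items:
        return 0.0
    hits = sum(1 for item in ranked_items[:k] if item in relevant_items)
    return hits / len(relevant_items)

def ndcg_at_k(ranked_items, relevant_items, k):
    """Binary-relevance NDCG: hits are discounted by log2 of their position."""
    dcg = sum(
        1.0 / math.log2(rank + 2)  # rank is 0-based, so position 1 gets log2(2)
        for rank, item in enumerate(ranked_items[:k])
        if item in relevant_items
    )
    # Ideal ordering puts every relevant item first, up to k positions.
    ideal_hits = min(len(relevant_items), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

# Made-up example: two relevant items, one of them ranked second.
ranked = ["a", "b", "c", "d"]
relevant = {"b", "d"}
recall = recall_at_k(ranked, relevant, k=3)
ndcg = ndcg_at_k(ranked, relevant, k=3)
```

In this example only "b" falls inside the top 3, so Recall@3 is 0.5, while NDCG@3 is lower than that because the hit sits in second place rather than first and the second relevant item is missed.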