Daily Arxiv

This page organizes papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please cite the source when sharing.

Equip Pre-ranking with Target Attention by Residual Quantization

Created by
  • Haebom

Author

Yutong Li, Yu Zhu, Yichen Qiao, Ziyu Guan, Lv Shao, Tong Liu, Bo Zheng

Outline

The pre-ranking stage of industrial recommender systems faces a fundamental conflict between efficiency and effectiveness. While powerful models like Target Attention (TA) excel at capturing complex feature interactions in the ranking stage, their high computational cost makes them unsuitable for pre-ranking, which relies on simple vector-product models. This discrepancy creates a performance bottleneck for the overall system. To bridge the gap, this paper proposes TARQ, a novel pre-ranking framework. Inspired by generative models, TARQ's core innovation is to bring an architecture that approximates TA into pre-ranking through residual quantization. This applies TA's modeling power to the latency-critical pre-ranking stage for the first time, establishing a new state-of-the-art trade-off between accuracy and efficiency. Extensive offline experiments and large-scale online A/B tests on Taobao demonstrate TARQ's significant improvements in ranking performance. The model has been fully deployed in production, serving tens of millions of daily active users and delivering significant business gains.
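The paper's core idea is residual quantization: an embedding is approximated as a sum of codewords chosen greedily from a sequence of codebooks, each level quantizing the residual left by the previous one. The sketch below illustrates only this generic mechanism, not the paper's actual TARQ implementation; the codebook sizes, dimensions, and random initialization are all hypothetical (in practice the codebooks would be learned).

```python
import numpy as np

# Hypothetical sizes for illustration only (not from the paper).
rng = np.random.default_rng(0)
dim, num_levels, codebook_size = 8, 3, 16

# One codebook per quantization level; randomly initialized for this sketch,
# whereas a real system would learn them end-to-end.
codebooks = [rng.normal(size=(codebook_size, dim)) for _ in range(num_levels)]

def residual_quantize(x, codebooks):
    """Greedily pick the nearest codeword at each level, then quantize the residual."""
    residual = x.copy()
    codes, reconstruction = [], np.zeros_like(x)
    for book in codebooks:
        idx = int(np.argmin(((book - residual) ** 2).sum(axis=1)))  # nearest codeword
        codes.append(idx)
        reconstruction += book[idx]
        residual = residual - book[idx]  # next level quantizes what remains
    return codes, reconstruction

x = rng.normal(size=dim)
codes, x_hat = residual_quantize(x, codebooks)
```

The discrete codes are the key to efficiency: at serving time, expensive attention-style computations can be traded for cheap lookups keyed by these indices, which is broadly how a TA-like model can be made compatible with pre-ranking latency budgets.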

Takeaways, Limitations

Takeaways:
• By bringing TA's powerful modeling capabilities to the pre-ranking stage, TARQ establishes a new state-of-the-art trade-off between accuracy and efficiency.
• Performance improvements are validated in a real-world environment through large-scale online A/B testing, yielding significant business gains.
• Presents a successful production deployment serving tens of millions of daily active users.
Limitations:
• Further research is needed on the generalization performance and limitations of approximating TA via residual quantization.
• Additional experiments are needed to determine applicability and generalization across diverse recommender-system environments.
• The model is optimized for a specific business environment (Taobao), so its performance in other settings requires further evaluation.