Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
The copyright of the paper belongs to the author and the relevant institution. When sharing, simply cite the source.

Enhancing Generative Auto-bidding with Offline Reward Evaluation and Policy Search

Created by
  • Haebom

Authors

Zhiyu Mou, Yiqin Lv, Miao Xu, Qi Wang, Yixiu Mao, Qichen Ye, Chao Li, Rongquan Bai, Chuan Yu, Jian Xu, Bo Zheng

Outline

Auto-bidding is a crucial tool for advertisers to improve their advertising performance. AI-Generated Bidding (AIGB), which learns a conditional generative planner from offline data, has outperformed typical offline reinforcement learning (RL)-based auto-bidding methods. However, existing AIGB methods hit a performance bottleneck because they are limited to static offline datasets. To address this, the paper proposes AIGB-Pearl (Planning with Evaluator via RL), a method that integrates generative planning and policy optimization. At its core, AIGB-Pearl constructs a trajectory evaluator that scores generation quality and designs a provably sound KL-Lipschitz-constrained score maximization scheme to ensure safe and efficient generalization beyond the offline dataset. The authors also devise a practical algorithm that incorporates the synchronous coupling technique to ensure the model regularity the scheme requires. Extensive experiments on simulated and real-world advertising systems demonstrate state-of-the-art performance.
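To make the idea of KL-constrained score maximization more concrete, here is a minimal sketch of the generic textbook version of the problem, not the paper's algorithm. It assumes a small, discrete set of candidate trajectories, a fixed base distribution standing in for the offline-trained planner, and hand-picked evaluator scores. It omits the Lipschitz constraint and synchronous coupling entirely. The function names, the `beta` penalty weight, and all numbers are hypothetical. For this simplified setting, the distribution that maximizes expected score minus `beta` times the KL divergence from the base has the well-known closed form `q(i) ∝ base(i) * exp(score(i) / beta)`.

```python
import math


def kl_regularized_reweight(base_probs, scores, beta):
    """Return q maximizing E_q[score] - beta * KL(q || base).

    Closed form: q(i) is proportional to base_probs[i] * exp(scores[i] / beta).
    Assumes every entry of base_probs is strictly positive.
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    # Work in log space and subtract the max for numerical stability.
    logits = [math.log(p) + s / beta for p, s in zip(base_probs, scores)]
    top = max(logits)
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(q, p):
    """KL(q || p) for two discrete distributions on the same support."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)


def expected_score(probs, scores):
    return sum(p * s for p, s in zip(probs, scores))


# Hypothetical toy data: three candidate bidding trajectories.
base = [0.5, 0.3, 0.2]      # stand-in for the offline-trained planner
scores = [1.0, 2.0, 4.0]    # stand-in for evaluator scores

cautious = kl_regularized_reweight(base, scores, beta=10.0)   # stays near base
aggressive = kl_regularized_reweight(base, scores, beta=0.5)  # chases the score
```

A large `beta` keeps the new distribution close to the base one, while a small `beta` shifts mass toward high-scoring candidates at the cost of a larger KL divergence. That trade-off is the intuition behind constraining how far a planner may move from what the offline data supports.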

Takeaways, Limitations

Takeaways:
AIGB-Pearl integrates generative planning and policy optimization to overcome the limitations of static offline datasets and improve on existing AIGB methods.
The KL-Lipschitz-constrained score maximization scheme is designed to ensure safe and efficient generalization beyond the offline dataset.
A practical algorithm using the synchronous coupling technique secures the model regularity that the scheme requires.
State-of-the-art performance is demonstrated in experiments on both simulated and real-world advertising systems.
Limitations:
The paper does not explicitly discuss its limitations; further research may reveal them.