Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Symmetric Behavior Regularization via Taylor Expansion of Symmetry

Created by
  • Haebom

Authors

Lingwei Zhu, Zheng Chen, Han Wang, Yukie Nagai

Outline

This paper introduces symmetric divergences to behavior regularization policy optimization (BRPO), establishing a novel offline reinforcement learning framework. Existing methods have focused on asymmetric divergences, such as KL, because they yield analytic regularized policies and practical minimization objectives. The paper shows that symmetric divergences do not admit an analytic policy when used as the regularizer and can cause numerical problems when used as the loss. To address both problems, the authors use the Taylor series of the $f$-divergence. Specifically, they show that an analytic policy can be obtained with a finite series. For the loss, a symmetric divergence can be decomposed into an asymmetric term and a conditionally symmetric term, and Taylor-expanding the latter alleviates the numerical problems. Combining these results, they propose Symmetric $f$ Actor-Critic (S$f$-AC), the first practical BRPO algorithm that uses symmetric divergences. Experiments on distribution approximation and MuJoCo confirm that S$f$-AC achieves competitive performance.
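To make the idea of Taylor-expanding an $f$-divergence concrete, here is a minimal sketch. It is an illustration under stated assumptions, not the S$f$-AC algorithm. It assumes the Jeffreys divergence as the symmetric example, with generator $f(t) = (t-1)\ln t$, and expands $f$ around $t = 1$, giving $D_f(p\|q) \approx \sum_{n=2}^{N} \frac{(-1)^n}{n-1} \sum_x q(x)\left(\frac{p(x)}{q(x)} - 1\right)^n$. The distributions `p` and `q` and the function names are hypothetical, and the series only converges when every ratio $p(x)/q(x)$ lies in $(0, 2]$.

```python
import math

def jeffreys(p, q):
    """Exact Jeffreys divergence KL(p||q) + KL(q||p) for discrete distributions."""
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))

def jeffreys_taylor(p, q, order):
    """Truncated Taylor series of the Jeffreys divergence around p/q = 1.

    Uses f(t) = (t - 1) ln t = sum_{n>=2} (-1)^n (t - 1)^n / (n - 1),
    which is valid only for ratios t = p/q in (0, 2].
    """
    total = 0.0
    for pi, qi in zip(p, q):
        d = pi / qi - 1.0  # deviation of the density ratio from 1
        total += qi * sum((-1) ** n * d ** n / (n - 1)
                          for n in range(2, order + 1))
    return total

# Hypothetical toy distributions whose ratios stay close to 1.
p = [0.30, 0.45, 0.25]
q = [0.25, 0.50, 0.25]

exact = jeffreys(p, q)
approx2 = jeffreys_taylor(p, q, 2)  # order 2 is the chi-squared divergence
approx6 = jeffreys_taylor(p, q, 6)

print("exact  :", exact)
print("order 2:", approx2)
print("order 6:", approx6)
```

In this toy setting the truncated series moves closer to the exact value as the order grows. The second-order term is the Pearson $\chi^2$ divergence, which is a polynomial in the ratio and so avoids the logarithm altogether.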

Takeaways, Limitations

Takeaways: The paper proposes S$f$-AC, a novel offline reinforcement learning algorithm that leverages symmetric divergences. It overcomes a limitation of existing BRPO algorithms, which rely on asymmetric divergences, and demonstrates competitive performance. It also presents a remedy for the numerical problems of symmetric divergences based on the Taylor series of the $f$-divergence.
Limitations: Further experiments across a wider variety of environments and tasks are needed to evaluate how well the proposed method generalizes. There is no clear guidance on choosing the order of the Taylor series.