Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Average-Reward Soft Actor-Critic

Created by
  • Haebom

Author

Jacob Adamczyk, Volodymyr Makarenko, Stas Tiomkin, Rahul V. Kulkarni

Outline

This paper addresses the recent growing interest in average-reward formulations of reinforcement learning (RL), which can solve long-horizon problems without discounting. In the discounted setting, entropy-regularized algorithms have been developed and shown to outperform their deterministic counterparts. However, no deep RL algorithm has yet targeted the entropy-regularized average-reward objective. To address this gap, this paper proposes an average-reward soft actor-critic algorithm. The method is validated by comparing it with existing average-reward algorithms on standard RL benchmarks, where it achieves superior performance under the average-reward criterion.
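
To make the core idea concrete, below is a minimal sketch of how an average-reward soft critic update might look, assuming a PyTorch-style setup. It is an illustration of the general technique, not the authors' implementation; names such as `q_net`, `q_target_net`, `policy.sample`, and `rho` are hypothetical. The key difference from discounted SAC is that the Bellman target subtracts a learned average-reward estimate rho instead of multiplying the next value by a discount factor gamma.

```python
# Hedged illustration of an average-reward (differential) soft critic loss.
# Assumes: q_net(s, a) and q_target_net(s, a) return Q-values, policy.sample(s)
# returns (action, log_prob), and rho is a learned scalar estimate of the
# long-run average entropy-regularized reward.

import torch
import torch.nn as nn
import torch.nn.functional as F


def soft_value(q_target_net: nn.Module, policy, next_obs: torch.Tensor,
               alpha: float) -> torch.Tensor:
    """Entropy-regularized value V(s') = E_{a'}[Q(s', a') - alpha * log pi(a'|s')]."""
    next_action, next_log_prob = policy.sample(next_obs)
    q_next = q_target_net(next_obs, next_action)
    return q_next - alpha * next_log_prob


def average_reward_critic_loss(q_net: nn.Module, q_target_net: nn.Module, policy,
                               batch, rho: torch.Tensor, alpha: float) -> torch.Tensor:
    """TD loss for the differential Q-function under the average-reward objective.

    Target: r - rho + V(s'), with no discount factor; compare with the
    discounted-SAC target r + gamma * V(s').
    """
    obs, action, reward, next_obs = batch
    with torch.no_grad():
        target = reward - rho + soft_value(q_target_net, policy, next_obs, alpha)
    q_pred = q_net(obs, action)
    return F.mse_loss(q_pred, target)
```

In this kind of setup, rho is typically updated alongside the critic (e.g., from the TD errors), and the actor is trained to maximize the entropy-regularized Q-values as in standard SAC.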

Takeaways, Limitations

Takeaways: The paper presents a novel deep reinforcement learning algorithm (average-reward soft actor-critic) for the entropy-regularized average-reward objective, demonstrating the effectiveness of the average-reward formulation by outperforming existing algorithms on standard RL benchmarks. It offers a new approach to the average-reward problem within the actor-critic framework.
Limitations: The reported performance may be limited to the specific benchmarks tested; further research is needed to assess generalization across a wider variety of environments. An analysis of the algorithm's computational cost and complexity is also lacking.