
Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Scalable Submodular Policy Optimization via Pruned Submodularity Graph

Created by
  • Haebom

Authors

Aditi Anand, Suman Banerjee, Dildar Ali

Outline

This paper addresses Reinforcement Learning (RL) problems in which the reward function is submodular. Conventional RL assumes an additive reward function, but in real-world problems such as path planning and adaptive control, a submodular function with diminishing returns is often the more appropriate model. The authors propose a pruned-submodularity-graph-based technique for RL with submodular rewards, prove that it finds an approximately optimal policy in feasible computation time, and analyze its time and space complexity along with its performance guarantee. Experiments on benchmark environments from prior work confirm that the proposed technique obtains higher rewards than existing methods.
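
As a quick illustration of diminishing returns, consider a coverage-style reward: the value of visiting a state depends on what the trajectory has already observed, so the same state contributes less as the trajectory grows. The sketch below is a generic example of a submodular reward, not the paper's construction; the `coverage_map` is hypothetical data.

```python
# A minimal sketch of a submodular (diminishing-returns) reward, assuming a
# hypothetical coverage_map from states to the regions they observe.
# Generic illustration only, not the paper's construction.

def coverage_reward(trajectory, coverage_map):
    """Reward = number of distinct regions observed along the trajectory."""
    covered = set()
    for state in trajectory:
        covered |= coverage_map[state]
    return len(covered)

coverage_map = {"s1": {"a", "b"}, "s2": {"b", "c"}, "s3": {"a", "b"}}

# The marginal gain of s3 shrinks as the trajectory grows (diminishing returns),
# which an additive reward cannot express:
print(coverage_reward(["s3"], coverage_map))        # 2: s3 alone is worth 2
print(coverage_reward(["s1", "s3"], coverage_map))  # 2: after s1, s3 adds 0
print(coverage_reward(["s2", "s3"], coverage_map))  # 3: after s2, s3 adds 1
```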

Takeaways, Limitations

Takeaways: We present an efficient, approximate solution to RL problems with submodular reward functions. Experimental results showing higher rewards than existing methods demonstrate the superiority of the proposed technique, and the time and space complexity analysis supports its practicality.
Limitations: The performance guarantee covers only approximate solutions and does not guarantee optimality (see the greedy baseline sketch below for why approximation is the standard target). The experiments are limited to specific benchmark environments, so generalization to other environments requires further study. Applicability and performance analysis for various classes of submodular functions also remain to be investigated.
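
For context on why only approximation is guaranteed: exactly maximizing a monotone submodular function is NP-hard in general, and the classical greedy algorithm of Nemhauser, Wolsey, and Fisher achieves a (1 - 1/e) ≈ 0.63 approximation under a cardinality constraint. The sketch below shows that standard greedy baseline, not the paper's pruned-submodularity-graph method; the set-cover objective is hypothetical example data.

```python
# Classical greedy maximization of a monotone submodular function under a
# cardinality constraint: the standard (1 - 1/e)-approximation baseline,
# NOT the paper's pruned-submodularity-graph algorithm.

def greedy_max(f, ground_set, k):
    """Repeatedly add the element with the largest marginal gain under f."""
    selected = []
    for _ in range(k):
        best, best_gain = None, float("-inf")
        for e in ground_set:
            if e in selected:
                continue
            gain = f(selected + [e]) - f(selected)
            if gain > best_gain:
                best, best_gain = e, gain
        selected.append(best)
    return selected

# Example objective: set cover (monotone and submodular), hypothetical data.
sets = {"s1": {"a", "b"}, "s2": {"b", "c"}, "s3": {"c", "d", "e"}}

def f(chosen):
    covered = set()
    for e in chosen:
        covered |= sets[e]
    return len(covered)

print(greedy_max(f, list(sets), 2))  # ['s3', 's1']: covers all five regions
```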