
Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

Which Experiences Are Influential for RL Agents? Efficiently Estimating The Influence of Experiences

Created by
  • Haebom

Authors

Takuya Hiraoka, Guanquan Wang, Takashi Onishi, Yoshimasa Tsuruoka

Outline

In this paper, we present Policy Iteration with Turn-over Dropout (PIToD), a novel method for efficiently estimating how individual experiences in the replay buffer influence the performance of reinforcement learning (RL) agents. PIToD avoids the prohibitive computational cost of the traditional leave-one-out (LOO) approach, which requires retraining the agent once per experience. We evaluate how accurately PIToD estimates the influence of experiences and how much more efficient it is than LOO. We also demonstrate that PIToD can improve the performance of low-performing RL agents by identifying negatively influential experiences and deleting their influence.
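The summary above does not include code, so the following is a minimal, self-contained sketch of the turn-over-dropout idea behind PIToD, transplanted onto a toy supervised regression problem rather than a full RL agent. All names here (`masks`, `train_step`, `estimate_influence`, the toy network) are hypothetical illustrations, not the authors' implementation. The core idea: each experience is tied to a fixed dropout mask, training on that experience only updates its own sub-network, and the complementary sub-network then approximates a model trained without that experience, so the expensive leave-one-out retraining is replaced by an extra forward pass per experience.

```python
import numpy as np

rng = np.random.default_rng(0)

HIDDEN = 64          # width of the toy hidden layer
N_EXPERIENCES = 100  # number of experiences in the (toy) replay buffer
D, OUT = 8, 1        # input / output dimensions of the toy regression task

# Each experience i is tied to a fixed random binary mask m_i over hidden
# units. Training on experience i only updates units where m_i == 1, so the
# complementary units never "see" experience i.
masks = rng.integers(0, 2, size=(N_EXPERIENCES, HIDDEN)).astype(np.float64)

params = {"W1": rng.normal(scale=0.1, size=(D, HIDDEN)),
          "W2": rng.normal(scale=0.1, size=(HIDDEN, OUT))}

def predict(params, x, mask):
    """Toy one-hidden-layer predictor with a unit-level dropout mask."""
    return (np.tanh(x @ params["W1"]) * mask) @ params["W2"]

def train_step(params, x, y, mask, lr=1e-2):
    """One gradient step; the mask keeps the dropped units untouched."""
    h = np.tanh(x @ params["W1"]) * mask
    err = h @ params["W2"] - y
    dh = (params["W2"] @ err) * (1.0 - np.tanh(x @ params["W1"]) ** 2) * mask
    params["W2"] -= lr * np.outer(h, err)
    params["W1"] -= lr * np.outer(x, dh)

def estimate_influence(params, eval_x, eval_y, mask):
    """
    Turn-over-dropout-style influence estimate: the sub-network selected by
    `mask` was trained on the experience, the flipped sub-network (1 - mask)
    was not, so their gap stands in for a leave-one-out comparison. A
    positive value means the loss is higher with the experience included,
    i.e. the experience likely had a negative influence.
    """
    loss_with = np.mean((predict(params, eval_x, mask) - eval_y) ** 2)
    loss_without = np.mean((predict(params, eval_x, 1.0 - mask) - eval_y) ** 2)
    return loss_with - loss_without

# Usage on synthetic data (purely illustrative):
data = [(rng.normal(size=D), rng.normal(size=OUT)) for _ in range(N_EXPERIENCES)]
for _ in range(5):
    for i, (x, y) in enumerate(data):
        train_step(params, x, y, masks[i])   # experience i touches only its sub-network

eval_x = np.stack([x for x, _ in data])
eval_y = np.stack([y for _, y in data])
scores = [estimate_influence(params, eval_x, eval_y, masks[i])
          for i in range(N_EXPERIENCES)]     # no retraining, one extra pass each
```

In the paper itself this masking is applied within policy iteration to the agent's Q-function and policy networks, and the influence is measured on RL performance rather than a regression loss; the sketch only conveys why the flipped-mask evaluation sidesteps per-experience retraining.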

Takeaways, Limitations

Takeaways:
We present PIToD, a novel method for efficiently estimating the influence of individual experiences in reinforcement learning with experience replay.
We experimentally demonstrate that the performance of low-performing RL agents can be improved by leveraging PIToD.
PIToD effectively addresses the computational cost problem of the LOO method.
Limitations:
The performance and efficiency of PIToD have been evaluated only on specific RL environments and agents; its generalizability to other environments or agents requires further study.
The comparative analysis covers only the strategy of removing negatively influential experiences; other improvement strategies are not compared.
Further research may be needed on the scalability of PIToD to large datasets.