Daily Arxiv

This page collects and organizes artificial intelligence papers published worldwide.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, please cite the source.

Flattening Hierarchies with Policy Bootstrapping

Created by
  • Haebom

Authors

John L. Zhou, Jonathan C. Kao

Outline

This paper addresses scaling offline goal-conditioned reinforcement learning (GCRL) to long horizons, where sparse rewards and discounting make learning difficult. The authors propose an algorithm that learns a flat (non-hierarchical) goal-conditioned policy by bootstrapping on subgoal-conditioned policies. Because the method requires no generative model over the subgoal space, it extends readily to high-dimensional goal spaces. The authors further show that existing hierarchical and bootstrapping-based approaches correspond to specific design choices within the proposed framework. Experiments demonstrate that the algorithm outperforms existing GCRL algorithms on a range of benchmarks, including complex, long-horizon tasks.
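To make the bootstrapping idea concrete, below is a minimal sketch, not the authors' implementation: it assumes a PyTorch setup in which a flat goal-conditioned policy is regressed onto the actions of a frozen subgoal-conditioned policy, with subgoals drawn from future states in the same dataset trajectory. All names (`MLPPolicy`, `bootstrap_step`, the batch layout) are illustrative.

```python
import torch
import torch.nn as nn

class MLPPolicy(nn.Module):
    """Small MLP that maps a (state, goal) pair to an action."""
    def __init__(self, state_dim, goal_dim, action_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + goal_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, state, goal):
        return self.net(torch.cat([state, goal], dim=-1))

def bootstrap_step(flat_policy, subgoal_policy, optimizer, batch):
    """One update: regress the flat long-horizon policy onto the actions of a
    frozen subgoal-conditioned policy. Subgoals are states observed later in
    the same trajectory, so no generative model over the subgoal space is
    needed -- the dataset itself supplies valid intermediate goals.
    (Illustrative sketch, not the paper's exact objective.)"""
    state, subgoal, final_goal = batch  # subgoal lies between state and final_goal
    with torch.no_grad():
        target_action = subgoal_policy(state, subgoal)   # short-horizon "teacher"
    pred_action = flat_policy(state, final_goal)         # flat "student" policy
    loss = ((pred_action - target_action) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Sampling subgoals from observed future states is what removes the need for a generative subgoal model: valid intermediate goals come directly from the data, which is why the approach scales to high-dimensional goal spaces.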

Takeaways, Limitations

  • Improves the scalability of offline GCRL algorithms to long horizons and complex environments.
  • Scales to high-dimensional goal spaces by eliminating the need for a generative model over the subgoal space.
  • Highlights the generality of the algorithm by relating it to existing hierarchical and bootstrapping-based approaches.
  • Outperforms existing algorithms across various benchmarks.
  • (Limitations are not explicitly discussed in the paper.)