Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Multi-Timescale Hierarchical Reinforcement Learning for Unified Behavior and Control of Autonomous Driving

Created by
  • Haebom

Authors

Guizhe Jin, Zhuoren Li, Bo Leng, Ran Yu, Lu Xiong, Chen Sun

Outline

This paper proposes a multi-timescale hierarchical reinforcement learning (RL) approach to address shortcomings in policy structure design for autonomous driving (AD). Existing RL-based AD methods often suffer from unstable or under-optimized driving behavior because their policies output either only short-term vehicle control commands or only long-term driving objectives. This work proposes a hierarchical policy structure in which a high-level policy generates long-term driving guidance and a low-level policy generates short-term control commands. The high-level policy expresses driving guidance explicitly as hybrid actions, which capture multi-modal driving behavior and are used to update the state of the low-level policy. In addition, a multi-timescale safety mechanism is designed to ensure safe driving. Evaluations on multi-lane highway scenarios, both in simulation and on the HighD dataset, show that the proposed approach effectively improves driving efficiency, behavior consistency, and safety.
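
To make the two-timescale structure concrete, below is a minimal, hypothetical Python sketch of a high-level policy that issues hybrid-action guidance (a discrete behavior plus a continuous target) at a slow rate, while a low-level policy emits control commands at every step. The class names, the 10:1 update ratio, the observation fields, and the toy control rules are illustrative assumptions, not the paper's implementation.

```python
# Hypothetical sketch of a two-timescale hierarchical policy loop.
# The paper's exact architecture is not described in this summary; the class names,
# the 10:1 timescale ratio, and the toy control rules are assumptions.
import numpy as np

class HighLevelPolicy:
    """Outputs a hybrid action: a discrete behavior (e.g., lane choice) plus a
    continuous guidance parameter (e.g., target speed), refreshed every K steps."""
    def act(self, obs):
        behavior = np.random.choice(["keep_lane", "change_left", "change_right"])
        target_speed = float(np.clip(np.random.normal(28.0, 2.0), 0.0, 33.0))  # m/s
        return {"behavior": behavior, "target_speed": target_speed}

class LowLevelPolicy:
    """Outputs short-term control commands (acceleration, steering) every step,
    conditioned on the current observation and the latest high-level guidance."""
    def act(self, obs, guidance):
        accel = 0.5 * (guidance["target_speed"] - obs["speed"])  # crude P-control
        steer = {"keep_lane": 0.0, "change_left": 0.05, "change_right": -0.05}[guidance["behavior"]]
        return np.array([np.clip(accel, -3.0, 3.0), steer])

def run_episode(env, high, low, K=10, max_steps=500):
    """K is the assumed ratio between the two timescales: the high-level policy
    re-plans once every K low-level control steps (gym-style env API assumed)."""
    obs = env.reset()
    guidance = None
    for t in range(max_steps):
        if t % K == 0:                    # slow timescale: refresh long-term guidance
            guidance = high.act(obs)
        control = low.act(obs, guidance)  # fast timescale: per-step control command
        obs, reward, done, info = env.step(control)
        if done:
            break
```

The key design point this sketch illustrates is that the high-level guidance changes only every K steps, so the low-level controller optimizes short-horizon control under a fixed longer-horizon objective rather than re-deciding the driving behavior at every timestep.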

Takeaways, Limitations

Takeaways:
We demonstrate that multi-timescale hierarchical reinforcement learning can improve the stability and efficiency of autonomous driving.
High-level policy representations based on hybrid actions can effectively capture multi-modal driving behavior.
We demonstrate that safety across multiple timescales can be achieved through the multi-timescale safety mechanism (a rough illustrative sketch follows the lists below).
Limitations:
The proposed method has not been validated in real-world road environments.
Heavy reliance on simulation and the HighD dataset means generalization performance requires further study.
The complexity of the hierarchical structure may increase training time and computational cost.
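
As a rough illustration of what safety checks on two timescales could look like (referenced in the takeaways above), here is a hypothetical sketch with a slow-timescale filter on the high-level guidance and a fast-timescale override on the low-level control command. The thresholds, observation fields, and time-to-collision rule are assumptions and are not taken from the paper.

```python
# Hypothetical sketch of a multi-timescale safety check; the paper's actual
# mechanism is not detailed in this summary. All thresholds are illustrative.
import numpy as np

def safe_guidance(guidance, obs, min_gap=20.0):
    """Slow-timescale filter: reject a lane-change behavior when the gap in the
    target lane is below an assumed minimum, falling back to lane keeping."""
    if guidance["behavior"] != "keep_lane" and obs["target_lane_gap"] < min_gap:
        return {**guidance, "behavior": "keep_lane"}
    return guidance

def safe_control(control, obs, ttc_threshold=2.0, max_brake=-6.0):
    """Fast-timescale filter: override the commanded acceleration with hard
    braking when time-to-collision to the lead vehicle drops below a threshold."""
    rel_speed = obs["speed"] - obs["lead_speed"]
    ttc = obs["lead_gap"] / rel_speed if rel_speed > 1e-3 else np.inf
    if ttc < ttc_threshold:
        return np.array([max_brake, control[1]])
    return control
```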