Daily Arxiv

This page collects papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; when sharing, please cite the source.

Finite Sample Analysis of Linear Temporal Difference Learning with Arbitrary Features

Created by
  • Haebom

Author

Zixuan Xie, Xinyu Liu, Rohan Chandra, Shangtong Zhang

Outline

Linear TD($\lambda$) is one of the fundamental reinforcement learning algorithms for policy evaluation. Its convergence rate has traditionally been established under the assumption that the features are linearly independent, an assumption that often fails in practice. This paper establishes the first $L^2$ convergence rates for linear TD($\lambda$) with arbitrary features, without modifying the algorithm or adding assumptions, in both the discounted and average-reward settings. Because arbitrary features can make the solution non-unique, the authors also develop a novel stochastic approximation result that characterizes the convergence rate to a set of solutions rather than to a single point.
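
Below is a minimal sketch of the linear TD($\lambda$) update the paper analyzes. The toy Markov chain, feature matrix, and step size are illustrative assumptions, not taken from the paper; the feature matrix is deliberately rank-deficient, so the features are linearly dependent, matching the arbitrary-features setting.

```python
import numpy as np

# Toy setup (illustrative assumptions, not from the paper).
rng = np.random.default_rng(0)
n_states, d = 5, 3
P = rng.dirichlet(np.ones(n_states), size=n_states)  # transition matrix under the target policy
r = rng.standard_normal(n_states)                    # expected reward per state
Phi = rng.standard_normal((n_states, d))             # feature matrix, one row per state
Phi[:, 2] = Phi[:, 0]                                # redundant feature: columns are linearly dependent

gamma, lam, alpha = 0.9, 0.5, 0.02
theta = np.zeros(d)   # weights; value estimate is Phi @ theta
z = np.zeros(d)       # eligibility trace (accumulating)
s = 0

for t in range(50_000):
    s_next = rng.choice(n_states, p=P[s])
    delta = r[s] + gamma * Phi[s_next] @ theta - Phi[s] @ theta  # TD error
    z = gamma * lam * z + Phi[s]                                 # decay trace, add current features
    theta += alpha * delta * z                                   # TD(lambda) update
    s = s_next

print("learned weights:", theta)
print("value estimates:", Phi @ theta)
```

Note that the algorithm itself is untouched: the paper's contribution is the analysis showing this plain update still enjoys an $L^2$ rate even when, as here, the features are not linearly independent.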

Takeaways, Limitations

Establishes the first $L^2$ convergence rate for linear TD($\lambda$) with arbitrary features
Applies to both the discounted and average-reward settings
Presents a new stochastic approximation result that handles non-uniqueness of the solution by characterizing convergence to a solution set (see the sketch after this list)
Requires no algorithm modifications or additional assumptions
(The summary does not discuss the paper's Limitations.)
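
To make the set-valued convergence statement concrete (hedged: this restates the standard notion, not the paper's exact theorem), with linearly dependent features the TD fixed-point equation $A\theta + b = 0$, where $A$ and $b$ denote the usual linear TD($\lambda$) matrix and vector, admits an affine set of solutions, and a rate is stated for the expected squared distance to that set:

$$
\Theta^* = \{\theta \in \mathbb{R}^d : A\theta + b = 0\}, \qquad
\mathbb{E}\big[\operatorname{dist}^2(\theta_t, \Theta^*)\big] = \mathbb{E}\Big[\inf_{\theta^* \in \Theta^*} \lVert \theta_t - \theta^* \rVert^2\Big].
$$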