This paper studies the convergence rate of Linear TD($\lambda$), one of the most fundamental reinforcement learning algorithms for policy evaluation. Prior analyses typically assume linearly independent features, an assumption that often fails in practice. This paper establishes the first $L^2$ convergence rates of Linear TD($\lambda$) with arbitrary features, without any algorithmic modifications or additional assumptions. Our results apply to both the discounted-reward and average-reward settings. To address the potential non-uniqueness of solutions arising from arbitrary features, we develop a novel stochastic approximation result that characterizes convergence rates to a set of solutions rather than to a single point.
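For reference, a textbook form of the Linear TD($\lambda$) update with eligibility traces in the discounted setting is sketched below; the notation (feature map $x$, step sizes $\alpha_t$, trace parameter $\lambda$, discount factor $\gamma$) is generic and may differ from the conventions used in the body of the paper.
\begin{align*}
% eligibility trace: decayed accumulation of visited feature vectors
e_t &= \gamma\lambda\, e_{t-1} + x(S_t), \\
% weight update: temporal-difference error scaled by the trace
w_{t+1} &= w_t + \alpha_t \bigl( R_{t+1} + \gamma\, w_t^\top x(S_{t+1}) - w_t^\top x(S_t) \bigr) e_t.
\end{align*}
When the features $\{x(s)\}$ are not linearly independent, the fixed point of this iteration need not be unique, which is precisely the setting of arbitrary features considered here.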