Linear TD($\lambda$) is one of the most fundamental reinforcement learning algorithms for policy evaluation. While its convergence rate is traditionally established under the assumption that the features are linearly independent, this assumption fails in many real-world scenarios. In this paper, we establish the first $L^2$ convergence rates for Linear TD($\lambda$) operating with arbitrary features, without modifying the algorithm or imposing additional assumptions. Our results apply to both the discounted and average-reward settings. To address the non-uniqueness of solutions arising from arbitrary features, we develop a novel stochastic approximation result that characterizes convergence rates to a set of solutions rather than to a single point.
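
For concreteness, the following is a minimal sketch of the standard discounted Linear TD($\lambda$) update to which such an analysis applies; the notation (features $\phi$, weights $w_t$, eligibility trace $e_t$, step sizes $\alpha_t$, trace parameter $\lambda$, discount factor $\gamma$) is conventional and not fixed by the abstract itself:
\begin{align*}
e_t &= \gamma \lambda\, e_{t-1} + \phi(S_t), \\
\delta_t &= R_{t+1} + \gamma\, \phi(S_{t+1})^\top w_t - \phi(S_t)^\top w_t, \\
w_{t+1} &= w_t + \alpha_t\, \delta_t\, e_t.
\end{align*}
When the features are not linearly independent, the fixed points of this update form a set rather than a single vector, which is the setting the stated results address.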