This paper proposes TaylorSeer to address the high computational cost of the Diffusion Transformer (DiT), which excels at high-resolution image and video synthesis. Existing feature-caching methods suffer from growing error because feature similarity decreases over large time intervals. TaylorSeer overcomes this limitation by predicting features at future timesteps from feature values at previous timesteps: exploiting the slow, continuous change of features across timesteps, it approximates higher-order derivatives from cached features and predicts future features via a Taylor series expansion. Experiments demonstrate high speed-up ratios in image and video synthesis: 4.99x on FLUX and 5.00x on HunyuanVideo with virtually no loss in quality, and on DiT a 4.53x acceleration with a 3.41x lower FID than the previous state-of-the-art.
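The core idea can be illustrated with a minimal sketch. Assuming equally spaced cached features and backward finite differences as the derivative estimates, a future feature is extrapolated with a truncated Taylor series. The function name `taylor_predict` and its signature are hypothetical, not the authors' implementation:

```python
import math
import numpy as np

def taylor_predict(cached, step=1.0, spacing=1.0):
    """Extrapolate a future feature via a truncated Taylor series.

    cached  : feature arrays at equally spaced past timesteps, most
              recent first, i.e. [f(t), f(t - spacing), f(t - 2*spacing), ...]
    step    : how far ahead of t to predict
    spacing : interval between cached entries

    The k-th derivative at t is approximated by the k-th backward
    finite difference divided by spacing**k (a sketch, not the exact
    scheme of the paper).
    """
    hist = [np.asarray(f, dtype=float) for f in cached]
    order = len(hist) - 1
    pred = np.zeros_like(hist[0])
    for k in range(order + 1):
        # hist[0] currently holds the k-th backward difference at t
        deriv = hist[0] / spacing ** k
        pred = pred + deriv * step ** k / math.factorial(k)
        # advance to the (k+1)-th backward differences
        hist = [hist[i] - hist[i + 1] for i in range(len(hist) - 1)]
    return pred

# With two cached values of a linear signal f(t) = 2t + 1 at t = 3, 2,
# the first-order prediction recovers f(4) = 9 exactly.
feats = [np.array([7.0]), np.array([5.0])]
print(taylor_predict(feats))  # → [9.]
```

Because the features change slowly across timesteps, even a low-order expansion stays accurate over intervals where plain feature reuse (zeroth-order caching) would already drift.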