This paper addresses consistency in long-form video generation, especially smoothness and the transitions between scenes. To improve consistency and coherence in videos generated from a single prompt or from multiple prompts, we propose TiARA, a time-frequency based temporal attention reweighting algorithm built on the Discrete Short-Time Fourier Transform (DSFT). TiARA improves inter-frame consistency by editing the attention score matrix according to a frequency-based analysis. In addition, we identify factors, such as prompt alignment, that matter for videos generated with multiple prompts, and we propose PromptBlend, a prompt interpolation pipeline that systematically aligns the prompts. Experimental results show that the proposed methods consistently and significantly improve performance over several baseline models.