In this paper, we propose WaveHiT-SR, a novel image super-resolution (SR) method that integrates the wavelet transform into a hierarchical transformer framework. Existing transformer-based SR methods have a limited receptive range because they compute attention within fixed, small windows. We instead employ adaptive hierarchical windows, which capture features at multiple levels and improve the modeling of long-range dependencies. Furthermore, we use the wavelet transform to decompose images into multiple frequency bands, which preserves structural details and lets the model attend to both global and local features. Hierarchical processing enables progressive reconstruction of high-resolution images, reducing computational complexity with minimal performance degradation. Extensive experiments demonstrate the effectiveness and efficiency of WaveHiT-SR: our improved versions of SwinIR-Light, SwinIR-NG, and SRFormer-Light achieve state-of-the-art SR results with higher efficiency (fewer parameters, fewer FLOPs, and higher speed).
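To make the frequency-band decomposition concrete, the following is a minimal sketch of a single-level 2D wavelet transform and its inverse. It assumes the Haar wavelet, which the text above does not specify, and it is not the WaveHiT-SR implementation; the function names `haar_dwt2` and `haar_idwt2` and the band names are hypothetical and chosen for illustration only.

```python
def haar_dwt2(img):
    """Single-level 2D Haar transform (assumed wavelet, illustration only).

    img is a 2D list of numbers with even height and width.
    Returns four half-resolution bands as a dict of 2D lists.
    """
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    bands = {"approx": [], "col_diff": [], "row_diff": [], "diag": []}
    for i in range(0, h, 2):
        rows = {name: [] for name in bands}
        for j in range(0, w, 2):
            # One 2x2 block: a b / c d
            a, b = img[i][j], img[i][j + 1]
            c, d = img[i + 1][j], img[i + 1][j + 1]
            rows["approx"].append((a + b + c + d) / 2)    # low-frequency content
            rows["col_diff"].append((a - b + c - d) / 2)  # change across columns
            rows["row_diff"].append((a + b - c - d) / 2)  # change across rows
            rows["diag"].append((a - b - c + d) / 2)      # diagonal change
        for name in bands:
            bands[name].append(rows[name])
    return bands


def haar_idwt2(bands):
    """Invert haar_dwt2, recovering the original image exactly."""
    approx = bands["approx"]
    out = []
    for i in range(len(approx)):
        top, bottom = [], []
        for j in range(len(approx[0])):
            ll = approx[i][j]
            cd = bands["col_diff"][i][j]
            rd = bands["row_diff"][i][j]
            dg = bands["diag"][i][j]
            top += [(ll + cd + rd + dg) / 2, (ll - cd + rd - dg) / 2]
            bottom += [(ll + cd - rd - dg) / 2, (ll - cd - rd + dg) / 2]
        out += [top, bottom]
    return out
```

Each band has half the height and width of the input, so the four bands together hold the same number of values as the original image, and the inverse recovers it without loss. The approximation band carries the coarse structure, while the three difference bands carry edges and fine detail.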