Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Fourier-Guided Attention Upsampling for Image Super-Resolution

Created by
  • Haebom

Authors

Daejun Choi, Youchan No, Jinhyung Lee, Duksu Kim

Outline

This paper proposes Frequency-Guided Attention (FGA), a lightweight upsampling module for single-image super-resolution. Conventional upsamplers such as sub-pixel convolution are efficient, but they often fail to reconstruct high-frequency details and introduce aliasing artifacts. FGA addresses these issues by combining (1) a Fourier feature-based multilayer perceptron (MLP) for positional frequency encoding, (2) a cross-resolution correlated attention layer for adaptive spatial alignment, and (3) a frequency-domain L1 loss for spectral fidelity supervision. With only 0.3M additional parameters, FGA consistently improves performance across five super-resolution backbones in both lightweight and full-capacity settings. Experiments show average PSNR gains of 0.12–0.14 dB and up to 29% improvement in frequency-domain coherence, particularly on texture-rich datasets. Visual and spectral evaluations confirm that FGA reduces aliasing and preserves fine details, making it a practical and scalable alternative to existing upsampling methods.
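To make two of the ingredients above more concrete, here is a minimal NumPy sketch of a Fourier feature encoding and a frequency-domain L1 loss. It is an illustration under stated assumptions, not the authors' implementation: the function names, the octave-spaced frequencies, the number of bands, and the choice to compare complex spectra directly are all hypothetical, since the summary does not specify them.

```python
import numpy as np


def fourier_features(coords, num_bands=4):
    """Map coordinates in [0, 1] to sin/cos features.

    coords: array of shape (N, D). Returns shape (N, 2 * D * num_bands).
    Assumption: octave-spaced frequencies (2**k * pi); the paper's
    actual frequency schedule is not given in the summary.
    """
    coords = np.asarray(coords, dtype=np.float64)
    freqs = (2.0 ** np.arange(num_bands)) * np.pi        # (num_bands,)
    angles = coords[:, :, None] * freqs[None, None, :]   # (N, D, num_bands)
    feats = np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)
    return feats.reshape(coords.shape[0], -1)


def frequency_l1_loss(pred, target):
    """Mean absolute difference between the 2D FFT spectra of two images.

    pred, target: arrays of shape (H, W).
    Assumption: the loss compares complex spectra with orthonormal
    scaling; an amplitude-only variant would be equally plausible.
    """
    pred_f = np.fft.fft2(pred, norm="ortho")
    target_f = np.fft.fft2(target, norm="ortho")
    return float(np.mean(np.abs(pred_f - target_f)))
```

In a training setup, a term like this would typically be added to a pixel-space loss with a weighting factor, so that errors in high-frequency bands are penalized explicitly rather than being averaged away in the spatial domain. The weighting used in the paper is not stated here.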

Takeaways, Limitations

Takeaways:
The lightweight FGA module addresses two weaknesses of conventional upsamplers: poor reconstruction of high-frequency details and aliasing artifacts.
It delivers consistent gains across a range of super-resolution backbones: an average PSNR increase of 0.12–0.14 dB and up to 29% improvement in frequency-domain coherence.
It adds only 0.3M parameters, which keeps the module lightweight and scalable.
Visual and spectral evaluations show that it is particularly effective on texture-rich images.
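For readers less familiar with the metric, the sketch below shows how PSNR is computed and what a gain of 0.12–0.14 dB implies for mean squared error. The helper names are hypothetical, and images are assumed to be scaled to the range [0, 1].

```python
import numpy as np


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB for images in [0, max_value]."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


def mse_ratio_for_gain(gain_db):
    """Factor by which MSE shrinks for a given PSNR gain in dB."""
    return 10.0 ** (-gain_db / 10.0)
```

Because PSNR is logarithmic, a gain of 0.12–0.14 dB corresponds to roughly 3% lower mean squared error, which is a modest but measurable improvement for a module of this size.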
Limitations:
The paper does not explicitly discuss its limitations or future research directions.
Further analysis is needed to determine whether there is a dependency on a specific dataset or backbone.
A broader evaluation of the generalization performance of the proposed method is needed.