Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

LD-RPS: Zero-Shot Unified Image Restoration via Latent Diffusion Recurrent Posterior Sampling

Created by
  • Haebom

Authors

Huaqiu Li, Yong Wang, Tongwen Huang, Hailang Huang, Haoqian Wang, Xiangxiang Chu

Outline

This paper presents a novel approach to unified image restoration, a critical task in low-level vision. Existing methods are either task-specific or rely on paired datasets for training, which leads to poor generalization and closed-set constraints. To address these issues, the authors propose LD-RPS, a dataset-free, unified approach that performs recurrent posterior sampling with a pretrained latent diffusion model. The method uses a multimodal understanding model to supply semantic priors to the generative model under task-independent conditions, uses lightweight modules to align the degraded input with the generative preferences of the diffusion model, and applies recurrent refinement to the posterior sampling. According to the authors, extensive experiments show that the method outperforms state-of-the-art methods, supporting its effectiveness and robustness. Code and data are available at https://github.com/AMAP-ML/LD-RPS.
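To make the idea of posterior sampling with recurrent refinement more concrete, here is a minimal toy sketch on a 1-D signal. It is not the authors' algorithm and does not use a latent diffusion model. Everything in it is an assumption made for illustration: the names `smooth_prior`, `data_step`, `sample_once`, and `recurrent_restore` are hypothetical, a simple smoothing filter stands in for the pretrained generative prior, and the degradation is a known brightness gain (the paper instead handles unknown degradations with learned alignment modules).

```python
import random


def smooth_prior(x, weight=0.5):
    """Stand-in for a pretrained denoiser (assumption): blend each sample
    with the mean of its 3-tap neighborhood, using replicate padding."""
    n = len(x)
    out = []
    for i in range(n):
        left = x[max(i - 1, 0)]
        right = x[min(i + 1, n - 1)]
        local_mean = (left + x[i] + right) / 3.0
        out.append((1.0 - weight) * x[i] + weight * local_mean)
    return out


def data_step(x, y, gain, step=1.0):
    """One gradient step on 0.5 * (gain * x - y)^2 per sample,
    pulling the estimate toward consistency with the observation y."""
    return [xi - step * gain * (gain * xi - yi) for xi, yi in zip(x, y)]


def sample_once(init, y, gain, sigma_start, steps, rng):
    """One annealed pass: noise the starting estimate, then alternate
    prior and data-consistency steps while the injected noise decays."""
    x = [v + sigma_start * rng.gauss(0.0, 1.0) for v in init]
    for t in range(steps - 1, -1, -1):
        x = smooth_prior(x)
        x = data_step(x, y, gain)
        sigma_t = sigma_start * t / steps  # reaches 0 on the final step
        x = [v + sigma_t * rng.gauss(0.0, 1.0) for v in x]
    return x


def recurrent_restore(y, gain, rounds=3, steps=10, sigma0=0.4, seed=0):
    """Recurrent refinement: each round restarts sampling from the
    previous round's estimate, with half the starting noise level."""
    rng = random.Random(seed)
    estimate = list(y)  # the first round starts from the degraded input
    for r in range(rounds):
        estimate = sample_once(estimate, y, gain, sigma0 * 0.5 ** r, steps, rng)
    return estimate


if __name__ == "__main__":
    clean = [0.8] * 16                 # hypothetical clean signal
    gain = 0.5                         # assumed known darkening factor
    degraded = [gain * v for v in clean]
    restored = recurrent_restore(degraded, gain)
    print([round(v, 3) for v in restored])
```

The design choice worth noting is the outer loop: instead of sampling once from pure noise, each round re-noises the previous estimate to a lower level and samples again, so later rounds refine rather than restart. With the default `step=1.0`, the data step is stable only when `gain` is at most 1.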

Takeaways, Limitations

Takeaways:
Addresses two limitations of existing methods: task-specific design and dependence on paired training datasets.
Presents a dataset-free, unified approach to image restoration.
Improves performance by leveraging a pretrained latent diffusion model and a multimodal understanding model.
Refines restoration quality through recurrent posterior sampling.
Improves robustness and generalization across various degradation types.
Limitations:
Results may depend on the quality of the pretrained latent diffusion model.
The quality of the multimodal understanding model can affect overall system performance.
Performance may drop for certain degradation types (further experimentation and analysis required).
Generalization to real-world applications needs further validation.