SkipDiff: Adaptive Skip Diffusion Model for High-Fidelity Perceptual Image Super-resolution

Authors

  • Xiaotong Luo, Xiamen University
  • Yuan Xie, East China Normal University
  • Yanyun Qu, Xiamen University
  • Yun Fu, Northeastern University

DOI:

https://doi.org/10.1609/aaai.v38i5.28195

Keywords:

CV: Low Level & Physics-based Vision, ML: Deep Generative Models & Autoencoders

Abstract

It is well known that image quality assessment faces the perception-distortion (p-d) tradeoff. Existing deep image super-resolution (SR) methods focus either on high fidelity with pixel-level objectives or on high perceptual quality with generative models. The emergence of diffusion models opens a fresh path for image restoration and has the potential to offer a new solution to the p-d tradeoff. We experimentally observed that perceptual quality and distortion change in opposite directions as the number of sampling steps increases. In light of this property, we propose an adaptive skip diffusion model (SkipDiff), which aims to achieve high-fidelity perceptual image SR with fewer sampling steps. Specifically, it decouples the sampling procedure into a coarse skip approximation stage and a fine skip refinement stage. A coarse-grained skip diffusion is first performed as a high-fidelity prior to obtain a latent approximation of the full diffusion. Then, a fine-grained skip diffusion further refines the latent sample to improve perceptual quality, with the fine time steps adaptively learned by deep reinforcement learning. This approach also enables faster sampling by skipping intermediate denoising steps, which shortens the effective computation. Extensive experimental results show that SkipDiff achieves superior perceptual quality with plausible reconstruction accuracy and a faster sampling speed.
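
The two-stage sampling described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the denoising network, the noise schedule, or the reinforcement-learning policy, so everything below is an assumption made for illustration. The names `coarse_schedule`, `skip_sample`, and `make_toy_step` are hypothetical, the coarse stage is assumed to use a uniform stride, the fine time steps are hard-coded where the paper would learn them, and a scalar toy update stands in for a real denoiser.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical signature: (x, t_cur, t_next) -> estimate of x at time t_next.
DenoiseStep = Callable[[float, int, int], float]


def coarse_schedule(total_steps: int, stride: int, stop_at: int) -> List[int]:
    """Uniformly strided time steps from total_steps - 1 down to at least stop_at.

    Assumption: the coarse stage skips with a fixed stride.
    """
    return list(range(total_steps - 1, stop_at - 1, -stride))


def skip_sample(
    x: float,
    denoise_step: DenoiseStep,
    coarse: Sequence[int],
    fine: Sequence[int],
) -> Tuple[float, List[int]]:
    """Run the coarse stage, then the fine stage, visiting only the listed steps."""
    schedule = list(coarse) + list(fine)
    if not schedule:
        raise ValueError("schedule must contain at least one time step")
    strictly_decreasing = all(a > b for a, b in zip(schedule, schedule[1:]))
    if min(schedule) <= 0 or not strictly_decreasing:
        raise ValueError("time steps must be positive and strictly decreasing")
    # Each visited step jumps directly to the next listed step; the last jumps to 0.
    targets = schedule[1:] + [0]
    for t_cur, t_next in zip(schedule, targets):
        x = denoise_step(x, t_cur, t_next)
    return x, schedule


def make_toy_step(target: float) -> DenoiseStep:
    """Toy stand-in for a denoiser: moves x toward target in proportion to the jump."""
    def step(x: float, t_cur: int, t_next: int) -> float:
        return x + (target - x) * (t_cur - t_next) / t_cur
    return step


coarse = coarse_schedule(total_steps=1000, stride=250, stop_at=250)  # [999, 749, 499]
fine = [200, 120, 50]  # placeholder for the adaptively learned fine time steps
x0, visited = skip_sample(5.0, make_toy_step(1.0), coarse, fine)
print(len(visited), "denoiser calls instead of 1000")
```

In this toy setting the sampler makes six denoiser calls rather than one per time step, which is the sense in which skipping shortens the effective computation. How the fine steps are chosen, and how that choice trades fidelity against perceptual quality, is the part the paper addresses with reinforcement learning and is not modeled here.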

Published

2024-03-24

How to Cite

Luo, X., Xie, Y., Qu, Y., & Fu, Y. (2024). SkipDiff: Adaptive Skip Diffusion Model for High-Fidelity Perceptual Image Super-resolution. Proceedings of the AAAI Conference on Artificial Intelligence, 38(5), 4017-4025. https://doi.org/10.1609/aaai.v38i5.28195

Issue

Section

AAAI Technical Track on Computer Vision IV