A Geometric Perspective on Optimizing Vector Quantized Latent Diffusion Model for Image Restoration

Chen Hang; Haoming Chen; Xuwei Fang; Weisheng Xie; Xiangxiang Gao; Faming Fang; Guixu Zhang; Haichuan Song

doi:10.1609/aaai.v40i6.42462

Authors

Chen Hang, East China Normal University
Haoming Chen, East China Normal University
Xuwei Fang, Bestpay AI Lab
Weisheng Xie, Bestpay AI Lab
Xiangxiang Gao, Bestpay AI Lab
Faming Fang, East China Normal University
Guixu Zhang, East China Normal University
Haichuan Song, East China Normal University

DOI: https://doi.org/10.1609/aaai.v40i6.42462

Abstract

In this paper, we investigate the limitations of the Vector Quantized Latent Diffusion Model (VQ-LDM) in image restoration tasks. We identify a performance gap between the Vector Quantization (VQ) and diffusion model components, which manifests as a significant discrepancy between the reconstruction quality of ground-truth images processed via VQ autoregression and that of degraded images restored by VQ-LDM. Through experiments, we attribute this gap primarily to the lack of robustness of the VQ mapped points in the original VQ-LDM framework. To address this issue, we propose a geometry-based optimization approach. First, we introduce a simple yet effective method, termed interpolation-based latent initial state optimization, which mitigates the performance gap by replacing the original mapped points with interpolated values and is supported by theoretical analysis. Here, the latent initial state refers specifically to the input of the diffusion model. Building on this, we propose a Chebyshev center-based latent initial state optimization, an elegant theoretical solution from a geometric perspective that further enhances restoration performance. Our improvements consistently achieve superior results across nine benchmark datasets.
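The two geometric ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say which points are interpolated or over which set the Chebyshev center is taken. The sketch assumes the common definition of the Chebyshev center of a finite point set as the center of its minimum enclosing ball, and approximates it with the Badoiu-Clarkson iteration. The function names `interpolate_latent` and `chebyshev_center`, the weight `alpha`, and the iteration count are all hypothetical.

```python
import numpy as np


def interpolate_latent(z_a, z_b, alpha=0.5):
    """Linearly interpolate between two latent vectors.

    Hypothetical helper: which two latents are combined, and how the
    weight `alpha` is chosen, is an assumption, not taken from the paper.
    """
    return (1.0 - alpha) * z_a + alpha * z_b


def chebyshev_center(points, iters=2000):
    """Approximate the center of the minimum enclosing ball of `points`.

    Uses the Badoiu-Clarkson iteration: repeatedly step toward the
    current farthest point with a shrinking step size 1 / (k + 1).
    """
    points = np.asarray(points, dtype=float)
    center = points[0].copy()
    for k in range(1, iters + 1):
        dists = np.linalg.norm(points - center, axis=1)
        farthest = points[np.argmax(dists)]
        center += (farthest - center) / (k + 1)
    radius = np.linalg.norm(points - center, axis=1).max()
    return center, radius


if __name__ == "__main__":
    # Toy 2-D "latents"; real latents would be high-dimensional.
    candidates = [[0.0, 0.0], [2.0, 0.0], [1.0, 0.5]]
    center, radius = chebyshev_center(candidates)
    print("approximate Chebyshev center:", center, "radius:", radius)
    print("interpolated latent:",
          interpolate_latent(np.array([0.0, 0.0]), np.array([2.0, 4.0]), 0.25))
```

The iteration is only approximate: its error shrinks roughly with the inverse square root of the number of steps, so an exact solver (for example, a small quadratic program) may be preferable when precision matters.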
