QuantVSR: Low-Bit Post-Training Quantization for Real-World Video Super-Resolution

Authors

  • Bowen Chai, Shanghai Jiao Tong University
  • Zheng Chen, Shanghai Jiao Tong University
  • Libo Zhu, Shanghai Jiao Tong University
  • Wenbo Li, Joy Future Academy
  • Yong Guo, Max Planck Institute for Informatics
  • Yulun Zhang, Shanghai Jiao Tong University

DOI:

https://doi.org/10.1609/aaai.v40i4.37257

Abstract

Diffusion models have shown superior performance in real-world video super-resolution (VSR). However, their slow inference and heavy resource consumption hinder practical deployment. Quantization offers a potential solution for compressing VSR models. Nevertheless, quantizing VSR models is challenging because of their temporal characteristics and high fidelity requirements. To address these issues, we propose QuantVSR, a low-bit quantization model for real-world VSR. We introduce a spatio-temporal complexity aware (STCA) mechanism that first uses the calibration dataset to measure both the spatial and temporal complexity of each layer. Based on these statistics, we allocate layer-specific ranks to the low-rank full-precision (FP) auxiliary branch. We then jointly refine the FP and low-bit branches so that both are optimized simultaneously. In addition, we propose a learnable bias alignment (LBA) module to reduce biased quantization errors. Extensive experiments on synthetic and real-world datasets demonstrate that our method achieves performance comparable to the FP model and significantly outperforms recent leading low-bit quantization methods.
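The abstract gives no formulas, so the following is only a rough NumPy sketch of the general idea: score each layer's calibration activations, map the scores to ranks, keep a low-rank FP branch beside a low-bit branch, and correct the mean output error. Every name, the complexity proxies, the linear rank mapping, the SVD split, and the closed-form bias estimate are assumptions made for illustration, not the authors' implementation. In particular, the paper's joint refinement and its learnable bias alignment are not reproduced here.

```python
import numpy as np

def quantize_uniform(w, bits):
    """Symmetric per-tensor fake quantization (an assumed, generic quantizer)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    if scale == 0:
        return w.copy()
    return np.clip(np.round(w / scale), -qmax - 1, qmax) * scale

def complexity_score(acts):
    """Hypothetical complexity proxy for one layer.

    acts has shape (frames, tokens, channels). Spatial complexity is taken as
    the std over tokens, temporal complexity as the mean absolute frame difference.
    """
    spatial = acts.std(axis=1).mean()
    temporal = np.abs(np.diff(acts, axis=0)).mean()
    return spatial + temporal

def allocate_ranks(scores, r_min=4, r_max=32):
    """Map per-layer scores linearly onto [r_min, r_max] (assumed rule)."""
    s = np.asarray(scores, dtype=float)
    span = s.max() - s.min()
    norm = (s - s.min()) / span if span > 0 else np.zeros_like(s)
    return np.round(r_min + norm * (r_max - r_min)).astype(int)

def split_weight(w, rank, bits=4):
    """Keep the top-`rank` singular components in FP and quantize the residual."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    a = u[:, :rank] * s[:rank]   # (out, rank), full precision
    b = vt[:rank]                # (rank, in), full precision
    residual_q = quantize_uniform(w - a @ b, bits)
    return a, b, residual_q

def forward(x, a, b, residual_q, bias_corr):
    """Low-bit branch plus low-rank FP branch plus a bias correction term."""
    return x @ residual_q.T + (x @ b.T) @ a.T + bias_corr

def fit_bias_correction(x, w, a, b, residual_q):
    """Closed-form stand-in for a learned bias: mean output error on calibration data."""
    y_fp = x @ w.T
    y_q = forward(x, a, b, residual_q, 0.0)
    return (y_fp - y_q).mean(axis=0)
```

In this sketch, a layer whose calibration activations vary more across tokens and frames receives a larger rank, so more of its weight is kept in full precision, while the bias term removes the average (biased) part of the remaining quantization error on the calibration batch.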

Published

2026-03-14

How to Cite

Chai, B., Chen, Z., Zhu, L., Li, W., Guo, Y., & Zhang, Y. (2026). QuantVSR: Low-Bit Post-Training Quantization for Real-World Video Super-Resolution. Proceedings of the AAAI Conference on Artificial Intelligence, 40(4), 2689–2697. https://doi.org/10.1609/aaai.v40i4.37257

Issue

Section

AAAI Technical Track on Computer Vision I