Discrete Prior-Based Temporal-Coherent Content Prediction for Blind Face Video Restoration

Authors

  • Lianxin Xie, School of Computer Science and Engineering, South China University of Technology
  • Bingbing Zheng, School of Computer Science and Engineering, South China University of Technology
  • Wen Xue, School of Computer Science and Engineering, South China University of Technology
  • Yunfei Zhang, School of Computer Science and Engineering, South China University of Technology
  • Le Jiang, School of Computer Science and Engineering, South China University of Technology
  • Ruotao Xu, Institute of Super Robotics (Huangpu)
  • Si Wu, School of Computer Science and Engineering, South China University of Technology; Institute of Super Robotics (Huangpu)
  • Hau-San Wong, Department of Computer Science, City University of Hong Kong

DOI:

https://doi.org/10.1609/aaai.v39i8.32944

Abstract

Blind face video restoration aims to restore high-fidelity details from videos subjected to complex and unknown degradations. The task poses a significant challenge: handling temporal heterogeneity while keeping face attributes stable. To address this challenge, we introduce a Discrete Prior-based Temporal-Coherent content prediction transformer, referred to as DP-TempCoh. Specifically, we incorporate a spatial-temporal-aware content prediction module to synthesize high-quality content from discrete visual priors, conditioned on degraded video tokens. To further enhance the temporal coherence of the predicted content, a motion statistics modulation module adjusts the content based on discrete motion priors, expressed as cross-frame mean and variance. As a result, the statistics of the predicted content can match those of real videos over time. Through extensive experiments, we verify the effectiveness of the design elements and demonstrate the superior performance of DP-TempCoh in restoring both synthetically and naturally degraded videos.
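
The abstract describes adjusting predicted content so that its cross-frame mean and variance match target statistics. The sketch below illustrates only that general idea of statistic matching along the time axis; it is not the authors' implementation. The function name, the list-of-frames data layout, and the target_mean and target_std inputs are all assumptions made for illustration, standing in for statistics that the paper derives from discrete motion priors by a mechanism not described here.

```python
from statistics import fmean, pstdev


def modulate_cross_frame_stats(frames, target_mean, target_std, eps=1e-8):
    """Shift and scale each feature so its statistics across frames match targets.

    frames:      list of T frames, each a list of C feature values (assumed layout).
    target_mean: list of C target means (hypothetical stand-in for a motion prior).
    target_std:  list of C target standard deviations (same caveat).
    """
    num_features = len(frames[0])
    out = [[0.0] * num_features for _ in frames]
    for j in range(num_features):
        # Collect feature j over time and measure its current statistics.
        series = [frame[j] for frame in frames]
        mu = fmean(series)
        sigma = pstdev(series)
        for t, value in enumerate(series):
            # Normalize across frames, then re-scale to the target statistics.
            normalized = (value - mu) / (sigma + eps)
            out[t][j] = normalized * target_std[j] + target_mean[j]
    return out


# Toy input: 4 frames, 2 features per frame.
frames = [[0.0, 10.0], [1.0, 12.0], [2.0, 17.0], [3.0, 13.0]]
adjusted = modulate_cross_frame_stats(frames, [5.0, -1.0], [2.0, 0.5])
```

After the call, each feature of adjusted has, across the four frames, a mean and standard deviation close to the requested targets, while the relative ordering of values over time is preserved.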

Published

2025-04-11

How to Cite

Xie, L., Zheng, B., Xue, W., Zhang, Y., Jiang, L., Xu, R., … Wong, H.-S. (2025). Discrete Prior-Based Temporal-Coherent Content Prediction for Blind Face Video Restoration. Proceedings of the AAAI Conference on Artificial Intelligence, 39(8), 8736–8744. https://doi.org/10.1609/aaai.v39i8.32944

Section

AAAI Technical Track on Computer Vision VII