Understanding Deformable Alignment in Video Super-Resolution

Authors

  • Kelvin C.K. Chan (S-Lab, Nanyang Technological University)
  • Xintao Wang (Applied Research Center, Tencent PCG)
  • Ke Yu (CUHK – SenseTime Joint Lab, The Chinese University of Hong Kong)
  • Chao Dong (Shenzhen Key Lab of Computer Vision and Pattern Recognition, SIAT-SenseTime Joint Lab, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; SIAT Branch, Shenzhen Institute of Artificial Intelligence and Robotics for Society)
  • Chen Change Loy (S-Lab, Nanyang Technological University)

Keywords

Low Level & Physics-based Vision

Abstract

Deformable convolution, originally proposed to adapt to geometric variations of objects, has recently shown compelling performance in aligning multiple frames and is increasingly adopted for video super-resolution. Despite its remarkable performance, its underlying mechanism for alignment remains unclear. In this study, we carefully investigate the relation between deformable alignment and the classic flow-based alignment. We show that deformable convolution can be decomposed into a combination of spatial warping and convolution. This decomposition reveals that deformable alignment and flow-based alignment share a common formulation, with a key difference in their offset diversity. We further demonstrate through experiments that the increased diversity in deformable alignment yields better-aligned features, and hence significantly improves the quality of video super-resolution output. Based on our observations, we propose an offset-fidelity loss that guides the offset learning with optical flow. Experiments show that our loss successfully avoids the overflow of offsets and alleviates the instability problem of deformable alignment. Beyond its contributions to deformable alignment, our formulation inspires a more flexible way of introducing offset diversity into flow-based alignment, which improves its performance.
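
The decomposition described in the abstract can be checked numerically on a toy case. The sketch below is not the authors' implementation; it assumes a single-channel feature map, a 3x3 kernel, bilinear sampling, and zero padding, and every helper name (`bilinear`, `deformable_conv`, `warp`, `warp_then_fuse`) is hypothetical. It computes a deformable convolution directly, then recomputes it as nine spatial warps (one per kernel position, each with its own offset field) followed by a weighted sum across the warped maps, and compares the two results. Under this reading, warping with a single flow field would correspond to using one offset per location instead of nine.

```python
import math
import random


def bilinear(img, y, x):
    """Sample img (a list of rows) at real-valued (y, x); zero outside the image."""
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    val = 0.0
    for yy, wy in ((y0, 1 - (y - y0)), (y0 + 1, y - y0)):
        for xx, wx in ((x0, 1 - (x - x0)), (x0 + 1, x - x0)):
            if 0 <= yy < h and 0 <= xx < w:
                val += wy * wx * img[yy][xx]
    return val


# Regular 3x3 sampling grid: the fixed kernel positions p_k.
GRID = [(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]


def deformable_conv(img, weights, offsets):
    """Direct form: out(p) = sum_k w_k * img(p + p_k + offset_k(p))."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for k, (dy, dx) in enumerate(GRID):
                oy, ox = offsets[k][i][j]
                out[i][j] += weights[k] * bilinear(img, i + dy + oy, j + dx + ox)
    return out


def warp(img, flow):
    """Backward-warp img with a per-pixel displacement field flow[i][j] = (dy, dx)."""
    h, w = len(img), len(img[0])
    return [[bilinear(img, i + flow[i][j][0], j + flow[i][j][1])
             for j in range(w)] for i in range(h)]


def warp_then_fuse(img, weights, offsets):
    """Decomposed form: one warp per kernel position, then a weighted sum."""
    h, w = len(img), len(img[0])
    warped = []
    for k, (dy, dx) in enumerate(GRID):
        # Fold the fixed grid position into the learned offset to get a flow field.
        flow = [[(dy + offsets[k][i][j][0], dx + offsets[k][i][j][1])
                 for j in range(w)] for i in range(h)]
        warped.append(warp(img, flow))
    # Fusing across the warped maps acts like a 1x1 convolution over nine inputs.
    return [[sum(weights[k] * warped[k][i][j] for k in range(len(GRID)))
             for j in range(w)] for i in range(h)]


# Toy data: random feature map, kernel weights, and per-position offset fields.
rng = random.Random(0)
H, W = 5, 6
img = [[rng.uniform(-1, 1) for _ in range(W)] for _ in range(H)]
weights = [rng.uniform(-1, 1) for _ in GRID]
offsets = [[[(rng.uniform(-2, 2), rng.uniform(-2, 2)) for _ in range(W)]
            for _ in range(H)] for _ in GRID]

direct = deformable_conv(img, weights, offsets)
decomposed = warp_then_fuse(img, weights, offsets)
max_diff = max(abs(a - b) for ra, rb in zip(direct, decomposed)
               for a, b in zip(ra, rb))
print(f"max abs difference: {max_diff:.2e}")
```

The two forms should agree up to floating-point rounding, which is the sense in which the nine offset fields behave like nine diverse flow fields whose warped outputs are then fused.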

Published

2021-05-18

How to Cite

Chan, K. C., Wang, X., Yu, K., Dong, C., & Loy, C. C. (2021). Understanding Deformable Alignment in Video Super-Resolution. Proceedings of the AAAI Conference on Artificial Intelligence, 35(2), 973-981. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/16181

Section

AAAI Technical Track on Computer Vision I