Understanding Deformable Alignment in Video Super-Resolution


  • Kelvin C.K. Chan, S-Lab, Nanyang Technological University
  • Xintao Wang, Applied Research Center, Tencent PCG
  • Ke Yu, CUHK – SenseTime Joint Lab, The Chinese University of Hong Kong
  • Chao Dong, Shenzhen Key Lab of Computer Vision and Pattern Recognition, SIAT-SenseTime Joint Lab, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; SIAT Branch, Shenzhen Institute of Artificial Intelligence and Robotics for Society
  • Chen Change Loy, S-Lab, Nanyang Technological University




Low Level & Physics-based Vision


Deformable convolution, originally proposed for the adaptation to geometric variations of objects, has recently shown compelling performance in aligning multiple frames and is increasingly adopted for video super-resolution. Despite its remarkable performance, its underlying mechanism for alignment remains unclear. In this study, we carefully investigate the relation between deformable alignment and the classic flow-based alignment. We show that deformable convolution can be decomposed into a combination of spatial warping and convolution. This decomposition reveals the commonality of deformable alignment and flow-based alignment in formulation, but with a key difference in their offset diversity. We further demonstrate through experiments that the increased diversity in deformable alignment yields better-aligned features, and hence significantly improves the quality of video super-resolution output. Based on our observations, we propose an offset-fidelity loss that guides the offset learning with optical flow. Experiments show that our loss successfully avoids the overflow of offsets and alleviates the instability problem of deformable alignment. Aside from the contributions to deformable alignment, our formulation inspires a more flexible approach to introduce offset diversity to flow-based alignment, improving its performance.
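The decomposition stated in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The function names, the (C, H, W) tensor layout, the (dy, dx) offset ordering, and the zero padding outside the image are all assumptions made for illustration. The first function evaluates a deformable convolution pixel by pixel. The second warps the whole feature map once per kernel position and then mixes the warped copies with a 1x1 convolution. Under these assumptions the two should agree numerically.

```python
import numpy as np


def bilinear_sample(x, ys, xs):
    """Sample x of shape (C, H, W) at float coordinates ys, xs (same 2-D shape).

    Locations outside the image contribute zero (an assumed padding rule).
    """
    C, H, W = x.shape
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    out = np.zeros((C,) + ys.shape)
    for dy in (0, 1):
        for dx in (0, 1):
            yi, xi = y0 + dy, x0 + dx
            # Bilinear weight of this neighbour.
            wgt = (1 - np.abs(ys - yi)) * (1 - np.abs(xs - xi))
            valid = (yi >= 0) & (yi < H) & (xi >= 0) & (xi < W)
            vals = x[:, np.clip(yi, 0, H - 1), np.clip(xi, 0, W - 1)]
            out += vals * (wgt * valid)
    return out


def deform_conv_direct(x, weight, offsets):
    """Deformable convolution evaluated one output pixel at a time.

    x:       (C, H, W) input features
    weight:  (C_out, C, K, K) kernel
    offsets: (K*K, 2, H, W) learned (dy, dx) per kernel position and pixel
    """
    C_out, C, K, _ = weight.shape
    _, H, W = x.shape
    r = K // 2
    out = np.zeros((C_out, H, W))
    for i in range(H):
        for j in range(W):
            for k in range(K * K):
                ky, kx = divmod(k, K)
                py = i + ky - r + offsets[k, 0, i, j]
                px = j + kx - r + offsets[k, 1, i, j]
                v = bilinear_sample(x, np.array([[py]]), np.array([[px]]))
                out[:, i, j] += weight[:, :, ky, kx] @ v[:, 0, 0]
    return out


def deform_conv_decomposed(x, weight, offsets):
    """Same operation written as K*K spatial warps followed by a 1x1 convolution."""
    C_out, C, K, _ = weight.shape
    _, H, W = x.shape
    r = K // 2
    gy, gx = np.meshgrid(np.arange(H), np.arange(W), indexing="ij")
    out = np.zeros((C_out, H, W))
    for k in range(K * K):
        ky, kx = divmod(k, K)
        # Step 1: warp the whole feature map with the k-th flow-like field.
        warped = bilinear_sample(
            x, gy + ky - r + offsets[k, 0], gx + kx - r + offsets[k, 1]
        )
        # Step 2: 1x1 convolution, i.e. a channel-mixing matrix per kernel position.
        out += np.einsum("oc,chw->ohw", weight[:, :, ky, kx], warped)
    return out
```

Read this way, each of the K*K kernel positions carries its own flow-like offset field, which is one way to picture the offset diversity the abstract contrasts with flow-based alignment. A single shared field would correspond to warping the features only once.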




How to Cite

Chan, K. C., Wang, X., Yu, K., Dong, C., & Loy, C. C. (2021). Understanding Deformable Alignment in Video Super-Resolution. Proceedings of the AAAI Conference on Artificial Intelligence, 35(2), 973-981. https://doi.org/10.1609/aaai.v35i2.16181



AAAI Technical Track on Computer Vision I