SVFI: Spiking-Based Video Frame Interpolation for High-Speed Motion

Authors

  • Lujie Xia, National Engineering Research Center of Visual Technology (NERCVT), Peking University; Institute of Digital Media, School of Computer Science, Peking University
  • Jing Zhao, National Engineering Research Center of Visual Technology (NERCVT), Peking University; Institute of Digital Media, School of Computer Science, Peking University; National Computer Network Emergency Response Technical Team/Coordination Center of China
  • Ruiqin Xiong, National Engineering Research Center of Visual Technology (NERCVT), Peking University; Institute of Digital Media, School of Computer Science, Peking University
  • Tiejun Huang, National Engineering Research Center of Visual Technology (NERCVT), Peking University; Institute of Digital Media, School of Computer Science, Peking University; Beijing Academy of Artificial Intelligence

DOI:

https://doi.org/10.1609/aaai.v37i3.25393

Keywords:

CV: Computational Photography, Image & Video Synthesis; CV: Low Level & Physics-Based Vision

Abstract

Occlusion and motion blur make video frame interpolation challenging, since estimating complex motions between two frames is hard and unreliable, especially in highly dynamic scenes. This paper addresses these issues by exploiting the spike stream as auxiliary visual information between frames to synthesize target frames. Instead of estimating motion by optical flow from RGB frames, we present SVFI, a new dual-modal pipeline that takes both RGB frames and the corresponding spike stream as inputs. It extracts scene-structure and object-outline feature maps of the target frames from the spike stream. These feature maps are fused with the color and texture feature maps extracted from the RGB frames to synthesize the target frames. Because the spike stream records continuous information between the two frames, SVFI can directly extract information in occluded and motion-blurred areas of the target frames from it, making it more robust than previous optical-flow-based methods. Experiments show that SVFI outperforms state-of-the-art methods on a wide variety of datasets. For instance, in 7- and 15-frame-skip evaluations, it achieves up to 5.58 dB and 6.56 dB PSNR improvements over the corresponding second-best methods, BMBC and DAIN. SVFI also shows visually impressive performance in real-world scenes.
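To make the dual-modal idea concrete, below is a minimal sketch of the kind of spike/RGB feature fusion the abstract describes: one branch encodes the binary spike stream into structure and outline features, another encodes the two boundary RGB frames into color and texture features, and a decoder fuses them into the target frame. All module names, layer sizes, and the 32-step spike window are assumptions for illustration; this is not the authors' network.

```python
import torch
import torch.nn as nn


class DualModalFusionSketch(nn.Module):
    """Hypothetical dual-branch fusion module illustrating spike/RGB fusion.

    This is only a sketch of the pipeline shape described in the abstract,
    not the SVFI architecture itself.
    """

    def __init__(self, spike_steps: int = 32, feat: int = 32):
        super().__init__()
        # Spike branch: the spike stream around the target time is treated as a
        # (spike_steps)-channel image and encoded into structure/outline features.
        self.spike_enc = nn.Sequential(
            nn.Conv2d(spike_steps, feat, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(),
        )
        # RGB branch: the two boundary frames (2 x 3 channels) supply color/texture features.
        self.rgb_enc = nn.Sequential(
            nn.Conv2d(6, feat, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, padding=1), nn.ReLU(),
        )
        # Fusion + decoder: concatenated features are mapped to the synthesized target frame.
        self.decoder = nn.Sequential(
            nn.Conv2d(2 * feat, feat, 3, padding=1), nn.ReLU(),
            nn.Conv2d(feat, 3, 3, padding=1),
        )

    def forward(self, spikes, frame0, frame1):
        s = self.spike_enc(spikes)                         # structure/outline cues from spikes
        c = self.rgb_enc(torch.cat([frame0, frame1], 1))   # color/texture cues from RGB frames
        return self.decoder(torch.cat([s, c], 1))          # fused prediction of the target frame


# Toy usage with random tensors (batch of 1, 64x64 resolution).
model = DualModalFusionSketch()
spikes = torch.rand(1, 32, 64, 64)   # binary-like spike stream slice between the two frames
f0 = torch.rand(1, 3, 64, 64)
f1 = torch.rand(1, 3, 64, 64)
target = model(spikes, f0, f1)
print(target.shape)                  # torch.Size([1, 3, 64, 64])
```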

Published

2023-06-26

How to Cite

Xia, L., Zhao, J., Xiong, R., & Huang, T. (2023). SVFI: Spiking-Based Video Frame Interpolation for High-Speed Motion. Proceedings of the AAAI Conference on Artificial Intelligence, 37(3), 2910-2918. https://doi.org/10.1609/aaai.v37i3.25393

Section

AAAI Technical Track on Computer Vision III