Self-Supervised Joint Dynamic Scene Reconstruction and Optical Flow Estimation for Spiking Camera

Authors

  • Shiyan Chen, Peking University
  • Zhaofei Yu, Peking University
  • Tiejun Huang, Peking University

DOI:

https://doi.org/10.1609/aaai.v37i1.25108

Keywords:

CV: Computational Photography, Image & Video Synthesis, CV: Low Level & Physics-Based Vision, ML: Unsupervised & Self-Supervised Learning

Abstract

The spiking camera, a novel retina-inspired vision sensor, has shown great potential for capturing high-speed dynamic scenes with a sampling rate of 40,000 Hz. It abandons the concept of an exposure window: each of its photosensitive units continuously captures photons and fires spikes asynchronously. However, this special sampling mechanism prevents frame-based algorithms from being applied to spiking cameras. Reconstructing dynamic scenes and performing common computer vision tasks with a spiking camera remains a challenge. In this paper, we propose a self-supervised joint learning framework for optical flow estimation and reconstruction for the spiking camera. The framework reconstructs clean frame-based spiking representations in a self-supervised manner and then uses them to train the optical flow networks. We also propose an optical-flow-based inverse rendering process that achieves self-supervision by minimizing the difference with respect to the original spiking temporal aggregation image. The experimental results demonstrate that our method bridges the gap between synthetic and real-world scenes and achieves the desired results in real-world scenarios. To the best of our knowledge, this is the first attempt to jointly reconstruct dynamic scenes and estimate optical flow for the spiking camera from a self-supervised learning perspective.
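
The sampling mechanism described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a simplified, noise-free integrate-and-fire model of a static scene, and the function names `simulate_spikes` and `temporal_aggregation`, the `threshold` parameter, and the averaging used for aggregation are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_spikes(intensity, num_steps, threshold=1.0):
    """Simplified integrate-and-fire model (an assumption, not the paper's code).

    Each pixel accumulates incoming light at every time step and fires a
    spike whenever its accumulator reaches the threshold.
    """
    intensity = np.asarray(intensity, dtype=np.float64)
    accumulator = np.zeros_like(intensity)
    spikes = np.zeros((num_steps,) + intensity.shape, dtype=np.uint8)
    for t in range(num_steps):
        accumulator += intensity            # continuous photon capture
        fired = accumulator >= threshold    # pixels firing at this step
        spikes[t] = fired
        accumulator[fired] -= threshold     # reset by subtracting the threshold
    return spikes


def temporal_aggregation(spikes, threshold=1.0):
    """Average the spike stream over time to estimate per-pixel intensity."""
    return threshold * spikes.mean(axis=0)


# Static 2x2 scene with per-step light intensities (hypothetical values).
scene = np.array([[0.25, 0.5], [0.125, 0.0]])
spikes = simulate_spikes(scene, num_steps=400)
estimate = temporal_aggregation(spikes)
```

For a static scene, the aggregated firing rate recovers the per-pixel intensity. With motion, a plain temporal average blurs moving content, which is one reason reconstruction and optical flow are natural to treat jointly.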

Published

2023-06-26

How to Cite

Chen, S., Yu, Z., & Huang, T. (2023). Self-Supervised Joint Dynamic Scene Reconstruction and Optical Flow Estimation for Spiking Camera. Proceedings of the AAAI Conference on Artificial Intelligence, 37(1), 350-358. https://doi.org/10.1609/aaai.v37i1.25108

Section

AAAI Technical Track on Computer Vision I