Stereo Vision Conversion from Planar Videos Based on Temporal Multiplane Images

Authors

  • Shanding Diao School of Computer and Information, Hefei University of Technology, Hefei 230009, China
  • Yuan Chen School of Internet, Anhui University, Hefei 230039, China
  • Yang Zhao School of Computer and Information, Hefei University of Technology, Hefei 230009, China; Peng Cheng National Laboratory, Shenzhen 518000, China
  • Wei Jia School of Computer and Information, Hefei University of Technology, Hefei 230009, China
  • Zhao Zhang School of Computer and Information, Hefei University of Technology, Hefei 230009, China
  • Ronggang Wang Peng Cheng National Laboratory, Shenzhen 518000, China; School of Electronic and Computer Engineering, Peking University Shenzhen Graduate School, Shenzhen 518055, China

DOI:

https://doi.org/10.1609/aaai.v38i2.27917

Keywords:

CV: 3D Computer Vision, CV: Scene Analysis & Understanding

Abstract

With the rapid development of 3D movie and light-field displays, there is a growing demand for stereo videos. However, generating high-quality stereo videos from planar videos remains a challenging task. Traditional depth-image-based rendering techniques struggle to effectively handle the problem of occlusion exposure, which occurs when the occluded contents become visible in other views. Recently, the single-view multiplane images (MPI) representation has shown promising performance for planar video stereoscopy. However, the MPI still lacks real details that are occluded in the current frame, resulting in blurry artifacts in occlusion exposure regions. In fact, planar videos can leverage complementary information from adjacent frames to predict a more complete scene representation for the current frame. Therefore, this paper extends the MPI from still frames to the temporal domain, introducing the temporal MPI (TMPI). By extracting complementary information from adjacent frames based on optical flow guidance, obscured regions in the current frame can be effectively repaired. Additionally, a new module called masked optical flow warping (MOFW) is introduced to improve the propagation of pixels along optical flow trajectories. Experimental results demonstrate that the proposed method can generate high-quality stereoscopic or light-field videos from a single view and reproduce better occluded details than other state-of-the-art (SOTA) methods. Code: https://github.com/Dio3ding/TMPI
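The core idea behind flow-guided repair, warping an adjacent frame toward the current frame and keeping a mask of which pixels landed validly, can be illustrated with a minimal NumPy sketch. This is not the paper's MOFW module; the function name, nearest-neighbor sampling, and the bounds-only validity mask are illustrative assumptions.

```python
import numpy as np

def warp_with_mask(src, flow):
    """Backward-warp an adjacent frame `src` toward the current frame
    using a per-pixel flow field, and return a validity mask.

    src:  (H, W, C) adjacent frame
    flow: (H, W, 2) flow from the current frame to `src`, as (dx, dy)

    Note: this toy version marks a pixel valid only when its sampling
    location falls inside `src`; a real pipeline would also mask
    occluded / inconsistent flow (e.g. via forward-backward checks).
    """
    h, w = flow.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    # Nearest-neighbor sampling coordinates in the source frame.
    sx = np.round(xs + flow[..., 0]).astype(int)
    sy = np.round(ys + flow[..., 1]).astype(int)
    valid = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.zeros_like(src)
    out[valid] = src[sy[valid], sx[valid]]
    return out, valid
```

With a constant flow of (+1, +1), pixel (0, 0) of the output samples pixel (1, 1) of the source, while pixels whose sampling location leaves the image are masked out; the mask tells downstream stages which regions still need to be filled from other frames.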

Published

2024-03-24

How to Cite

Diao, S., Chen, Y., Zhao, Y., Jia, W., Zhang, Z., & Wang, R. (2024). Stereo Vision Conversion from Planar Videos Based on Temporal Multiplane Images. Proceedings of the AAAI Conference on Artificial Intelligence, 38(2), 1519–1527. https://doi.org/10.1609/aaai.v38i2.27917

Section

AAAI Technical Track on Computer Vision I