Spherical Image Generation from a Single Image by Considering Scene Symmetry

Authors

  • Takayuki Hara (The University of Tokyo)
  • Yusuke Mukuta (The University of Tokyo; RIKEN)
  • Tatsuya Harada (The University of Tokyo; RIKEN)

DOI:

https://doi.org/10.1609/aaai.v35i2.16242

Keywords:

Computational Photography, Image & Video Synthesis, Applications, Neural Generative Models & Autoencoders

Abstract

Spherical images taken in all directions (360 degrees by 180 degrees) allow the full surroundings of a subject to be represented, providing an immersive experience to viewers. Generating a spherical image from a single normal-field-of-view (NFOV) image is convenient and expands the usage scenarios considerably without relying on a specific panoramic camera or images taken from multiple directions; however, achieving such images remains a challenging and unresolved problem. The primary challenge is controlling the high degree of freedom involved in generating a wide area that includes all directions of the desired spherical image. We focus on scene symmetry, which is a basic property of the global structure of spherical images, such as rotational symmetry, plane symmetry, and asymmetry. We propose a method for generating a spherical image from a single NFOV image and controlling the degree of freedom of the generated regions using the scene symmetry. To estimate and control the scene symmetry using both a circular shift and flip of the latent image features, we incorporate the intensity of the symmetry as a latent variable into conditional variational autoencoders. Our experiments show that the proposed method can generate various plausible spherical images controlled from symmetric to asymmetric, and can reduce the reconstruction errors of the generated images based on the estimated symmetry.
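
The link between scene symmetry and the "circular shift and flip" of features can be illustrated with a small sketch. It is not the authors' implementation. It assumes an equirectangular feature map stored as a list of rows, where a rotation about the vertical axis corresponds to a circular shift along the width and a reflection across a vertical plane corresponds to a horizontal flip. The function names and the scalar `intensity` weight are hypothetical and introduced here for illustration only; the paper treats the symmetry intensity as a latent variable of a conditional variational autoencoder, which this sketch does not model.

```python
def circular_shift(feat, k):
    """Shift every row k columns to the right, wrapping around the width."""
    w = len(feat[0])
    k %= w
    if k == 0:
        return [list(row) for row in feat]
    return [row[-k:] + row[:-k] for row in feat]


def horizontal_flip(feat):
    """Mirror every row left to right."""
    return [row[::-1] for row in feat]


def average(maps):
    """Element-wise mean of equally sized 2-D maps."""
    n = len(maps)
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[sum(m[r][c] for m in maps) / n for c in range(cols)]
            for r in range(rows)]


def rotational_symmetrize(feat, n):
    """Average over n equally spaced circular shifts (n-fold rotation)."""
    w = len(feat[0])
    if w % n != 0:
        raise ValueError("width must be divisible by n")
    return average([circular_shift(feat, i * w // n) for i in range(n)])


def plane_symmetrize(feat):
    """Average the map with its horizontal mirror image."""
    return average([feat, horizontal_flip(feat)])


def blend(feat, symmetric, intensity):
    """Mix original and symmetrized features.

    intensity = 0 keeps the original (asymmetric) features,
    intensity = 1 uses the fully symmetrized features.
    This scalar weight is an illustrative assumption.
    """
    return [[(1 - intensity) * a + intensity * b for a, b in zip(ra, rb)]
            for ra, rb in zip(feat, symmetric)]


feat = [[1.0, 2.0, 3.0, 4.0]]
rot2 = rotational_symmetrize(feat, 2)   # invariant under a half-turn shift
mirror = plane_symmetrize(feat)         # invariant under a horizontal flip
half = blend(feat, rot2, 0.5)           # partway between asymmetric and symmetric
```

In this toy setting, a map symmetrized for 2-fold rotation is unchanged by a shift of half the width, and a plane-symmetrized map is unchanged by a flip. Varying `intensity` between 0 and 1 moves the features from asymmetric to symmetric, which loosely mirrors the controllable range described in the abstract.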

Published

2021-05-18

How to Cite

Hara, T., Mukuta, Y., & Harada, T. (2021). Spherical Image Generation from a Single Image by Considering Scene Symmetry. Proceedings of the AAAI Conference on Artificial Intelligence, 35(2), 1513-1521. https://doi.org/10.1609/aaai.v35i2.16242

Section

AAAI Technical Track on Computer Vision I