HORIZON: High-Resolution Semantically Controlled Panorama Synthesis

Authors

  • Kun Yan (SKLSDE Lab, Beihang University)
  • Lei Ji (Microsoft Research Asia)
  • Chenfei Wu (Microsoft Research Asia)
  • Jian Liang (Peking University)
  • Ming Zhou (Langboat Technology)
  • Nan Duan (Microsoft Research Asia)
  • Shuai Ma (SKLSDE Lab, Beihang University)

DOI:

https://doi.org/10.1609/aaai.v38i6.28463

Keywords:

CV: Computational Photography, Image & Video Synthesis, CV: 3D Computer Vision, CV: Applications, CV: Multi-modal Vision, NLP: Language Grounding & Multi-modal NLP

Abstract

Panorama synthesis endeavors to craft captivating 360-degree visual landscapes, immersing users in the heart of virtual worlds. Nevertheless, contemporary panoramic synthesis techniques grapple with the challenge of semantically guiding the content generation process. Although recent breakthroughs in visual synthesis have unlocked the potential for semantic control in 2D flat images, a direct application of these methods to panorama synthesis yields distorted content. In this study, we unveil an innovative framework for generating high-resolution panoramas, adeptly addressing the issues of spherical distortion and edge discontinuity through sophisticated spherical modeling. Our pioneering approach empowers users with semantic control, harnessing both image and text inputs, while concurrently streamlining the generation of high-resolution panoramas using parallel decoding. We rigorously evaluate our methodology on a diverse array of indoor and outdoor datasets, establishing its superiority over recent related work, in terms of both quantitative and qualitative performance metrics. Our research elevates the controllability, efficiency, and fidelity of panorama synthesis to new levels.

Published

2024-03-24

How to Cite

Yan, K., Ji, L., Wu, C., Liang, J., Zhou, M., Duan, N., & Ma, S. (2024). HORIZON: High-Resolution Semantically Controlled Panorama Synthesis. Proceedings of the AAAI Conference on Artificial Intelligence, 38(6), 6431-6439. https://doi.org/10.1609/aaai.v38i6.28463

Issue

Vol. 38 No. 6 (2024)

Section

AAAI Technical Track on Computer Vision V