SphereDiff: Tuning-free 360° Static and Dynamic Panorama Generation via Spherical Latent Representation

Authors

  • Minho Park, Korea Advanced Institute of Science and Technology
  • Taewoong Kang, Korea Advanced Institute of Science and Technology
  • Jooyeol Yun, Korea Advanced Institute of Science and Technology
  • Sungwon Hwang, Korea Advanced Institute of Science and Technology
  • Jaegul Choo, Korea Advanced Institute of Science and Technology

DOI:

https://doi.org/10.1609/aaai.v40i10.37779

Abstract

The increasing demand for AR/VR applications has highlighted the need for high-quality content, such as 360° live wallpapers. However, generating high-quality 360° panoramic content remains challenging because of the severe distortions introduced by equirectangular projection (ERP). Existing approaches either fine-tune pretrained diffusion models on limited ERP datasets or adopt tuning-free methods that still rely on ERP latent representations, often resulting in distracting distortions near the poles. In this paper, we introduce SphereDiff, a novel approach for synthesizing 360° static and live wallpapers with state-of-the-art diffusion models without additional tuning. We define a spherical latent representation that ensures consistent quality across all perspectives, including near the poles. We then extend MultiDiffusion to the spherical latent representation and propose a dynamic spherical latent sampling method that enables direct use of pretrained diffusion models. Moreover, we introduce distortion-aware weighted averaging to further improve generation quality. Our method outperforms existing approaches in generating 360° static and live wallpapers, making it a robust solution for immersive AR/VR applications.
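
The pole distortion that the abstract attributes to ERP can be illustrated with a small sketch. It is not taken from the paper: the function names (`erp_grid`, `fibonacci_sphere`, `polar_fraction`) are hypothetical, and the Fibonacci lattice is used here only as one common way to place points near-uniformly on a sphere, assumed for illustration rather than confirmed as the authors' sampling scheme. The sketch compares how many samples each layout spends on the polar caps (|latitude| ≥ 60°), which cover about 13.4% of the sphere's area.

```python
import math


def erp_grid(height, width):
    """Return (lat, lon) in radians for the pixel centers of an ERP grid."""
    points = []
    for row in range(height):
        # Rows are equally spaced in latitude, from +90 deg down to -90 deg.
        lat = math.pi / 2 - (row + 0.5) * math.pi / height
        for col in range(width):
            lon = -math.pi + (col + 0.5) * 2 * math.pi / width
            points.append((lat, lon))
    return points


def fibonacci_sphere(n):
    """Return n near-uniform (lat, lon) points from a Fibonacci lattice.

    Illustrative assumption: one standard near-uniform sphere sampling,
    not necessarily the layout used by SphereDiff.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    points = []
    for i in range(n):
        # Equal steps in z give equal-area bands on the unit sphere.
        z = 1.0 - (2.0 * i + 1.0) / n
        lat = math.asin(z)
        lon = (i * golden_angle) % (2 * math.pi) - math.pi
        points.append((lat, lon))
    return points


def polar_fraction(points, min_lat_deg=60.0):
    """Fraction of points whose absolute latitude is at least min_lat_deg."""
    threshold = math.radians(min_lat_deg)
    polar = sum(1 for lat, _ in points if abs(lat) >= threshold)
    return polar / len(points)


# True area fraction of the two caps with |lat| >= 60 deg: 1 - sin(60 deg).
cap_area_fraction = 1.0 - math.sin(math.radians(60.0))

erp_share = polar_fraction(erp_grid(64, 128))
sphere_share = polar_fraction(fibonacci_sphere(64 * 128))
```

Under these assumptions, the ERP grid places roughly a third of its samples in the polar caps, well above their share of the sphere's area, while the near-uniform lattice stays close to that share. This oversampling near the poles is the kind of imbalance a spherical latent layout is meant to avoid.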

Published

2026-03-14

How to Cite

Park, M., Kang, T., Yun, J., Hwang, S., & Choo, J. (2026). SphereDiff: Tuning-free 360° Static and Dynamic Panorama Generation via Spherical Latent Representation. Proceedings of the AAAI Conference on Artificial Intelligence, 40(10), 8305-8313. https://doi.org/10.1609/aaai.v40i10.37779

Section

AAAI Technical Track on Computer Vision VII