Target Scanpath-Guided 360-Degree Image Enhancement

Authors

  • Yujia Wang, Victoria University of Wellington
  • Fang-Lue Zhang, Victoria University of Wellington
  • Neil A. Dodgson, Victoria University of Wellington

DOI:

https://doi.org/10.1609/aaai.v39i8.32881

Abstract

360° images are widely used in fields such as virtual reality and user experience design. Our goal is to adjust these images to guide users' visual attention. To achieve this, we present a novel task: target scanpath-guided 360° image enhancement, which aims to enhance 360° images based on user-specified target scanpaths. We develop a Progressive Scanpath-Guided Enhancement Method (PSEM) that addresses this problem in three stages. In the first stage, we propose a Time-Alignment and Spatial Similarity Clustering (TASSC) algorithm that accounts for the spherical nature of 360° images and the temporal dependency of scanpaths to generate representative scanpaths. In the second stage, we learn the differences between the source and target scanpaths and select the objects to be edited based on these differences. In particular, we propose a Dual-Stream Scanpath Difference Encoder (DSDE) embedded into the Segment Anything Model (SAM) network for object mask generation. In the third stage, we employ a Stable Diffusion network fine-tuned with LoRA to produce the final enhanced image. Additionally, we design dedicated loss functions to supervise the training of the second and third stages. Experimental results demonstrate the effectiveness of our approach for scanpath-guided 360° image enhancement.
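The abstract does not specify how TASSC measures similarity between scanpaths, so the following is only an illustrative sketch of the two properties it names: respecting the spherical geometry of 360° images and comparing fixations in temporal order. It assumes fixations are given as (longitude, latitude) pairs in degrees, uses great-circle distance rather than pixel distance, and aligns two scanpaths by resampling both to the same number of steps. The function names, the resampling scheme, and the default of 8 steps are all hypothetical and are not taken from the paper.

```python
import math

def great_circle(p, q):
    """Angular distance in radians between two (lon, lat) points given in degrees."""
    lon1, lat1 = map(math.radians, p)
    lon2, lat2 = map(math.radians, q)
    # Haversine formula; it handles the wrap-around at the +/-180 degree seam.
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * math.asin(min(1.0, math.sqrt(a)))

def resample(path, n):
    """Pick n fixations at evenly spaced positions in the sequence (nearest index).

    Assumed alignment scheme for illustration only; requires n >= 2.
    """
    if len(path) == 1:
        return [path[0]] * n
    return [path[round(i * (len(path) - 1) / (n - 1))] for i in range(n)]

def scanpath_distance(a, b, n=8):
    """Mean great-circle distance between time-aligned fixations of two scanpaths."""
    ra, rb = resample(a, n), resample(b, n)
    return sum(great_circle(p, q) for p, q in zip(ra, rb)) / n
```

A pairwise distance of this kind could feed any standard clustering routine to pick representative scanpaths. Two fixations at longitudes 179° and -179° on the equator come out 2° apart here, whereas a naive equirectangular pixel distance would place them at opposite edges of the image.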

Published

2025-04-11

How to Cite

Wang, Y., Zhang, F.-L., & Dodgson, N. A. (2025). Target Scanpath-Guided 360-Degree Image Enhancement. Proceedings of the AAAI Conference on Artificial Intelligence, 39(8), 8169–8177. https://doi.org/10.1609/aaai.v39i8.32881

Issue

Section

AAAI Technical Track on Computer Vision VII