NeRF-VPT: Learning Novel View Representations with Neural Radiance Fields via View Prompt Tuning

Authors

  • Linsheng Chen, Sun Yat-sen University
  • Guangrun Wang, University of Oxford
  • Liuchun Yuan, Sun Yat-sen University
  • Keze Wang, Sun Yat-sen University
  • Ken Deng, Sun Yat-sen University
  • Philip H.S. Torr, University of Oxford

DOI:

https://doi.org/10.1609/aaai.v38i2.27877

Keywords:

CV: 3D Computer Vision

Abstract

Neural Radiance Fields (NeRF) have achieved remarkable success in novel view synthesis. Nonetheless, generating high-quality images for novel views remains a critical challenge. While existing efforts have made commendable progress, capturing intricate details, enhancing textures, and achieving superior Peak Signal-to-Noise Ratio (PSNR) metrics warrant further attention and advancement. In this work, we propose NeRF-VPT, an innovative method for novel view synthesis that addresses these challenges. NeRF-VPT employs a cascading view prompt tuning paradigm, in which RGB information from preceding rendering outcomes serves as instructive visual prompts for subsequent rendering stages, so that the prior knowledge embedded in the prompts can gradually enhance rendered image quality. NeRF-VPT only requires sampling RGB data from previous-stage renderings as priors at each training stage, without relying on extra guidance or complex techniques. It is therefore plug-and-play and can be readily integrated into existing methods. By comparing NeRF-VPT against several NeRF-based approaches on demanding benchmarks, including Realistic Synthetic 360°, Real Forward-Facing, the Replica dataset, and a user-captured dataset, we show that NeRF-VPT significantly elevates baseline performance and generates higher-quality novel view images than all the compared state-of-the-art methods. Furthermore, the cascading learning of NeRF-VPT adapts to scenarios with sparse inputs, yielding a significant accuracy improvement for sparse-view novel view synthesis. The source code and dataset are available at https://github.com/Freedomcls/NeRF-VPT.
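
To make the cascading view prompt tuning idea concrete, below is a minimal, hypothetical PyTorch sketch of the control flow the abstract describes: the RGB output of one rendering stage is fed as a visual prompt to the next stage, which refines it. This is not the authors' implementation (see the linked repository for that); the small convolutional network stands in for an actual NeRF rendering stage so the sketch stays self-contained, and all names (`PromptedRenderStage`, `cascaded_render`) are illustrative assumptions.

```python
# Hypothetical sketch of cascading view prompt tuning: stage k+1 is
# conditioned on the RGB rendering produced by stage k. Illustrative only.
import torch
import torch.nn as nn

class PromptedRenderStage(nn.Module):
    """One rendering stage conditioned on an RGB view prompt.

    A real NeRF stage would ray-march an MLP over sampled 3D points;
    a tiny conv net stands in here to keep the sketch runnable.
    """
    def __init__(self, hidden: int = 32):
        super().__init__()
        # 6 input channels: 3 for the coarse render + 3 for the prompt image.
        self.net = nn.Sequential(
            nn.Conv2d(6, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, 3, 3, padding=1), nn.Sigmoid(),
        )

    def forward(self, coarse_rgb: torch.Tensor, prompt_rgb: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([coarse_rgb, prompt_rgb], dim=1))

def cascaded_render(stages: nn.ModuleList, coarse_rgb: torch.Tensor) -> torch.Tensor:
    # Stage 0 starts from a neutral (zero) prompt; every later stage is
    # prompted with the RGB output of the previous stage, refining it.
    prompt = torch.zeros_like(coarse_rgb)
    for stage in stages:
        prompt = stage(coarse_rgb, prompt)
    return prompt

if __name__ == "__main__":
    stages = nn.ModuleList([PromptedRenderStage() for _ in range(3)])
    coarse = torch.rand(1, 3, 64, 64)   # stand-in for a baseline NeRF render
    refined = cascaded_render(stages, coarse)
    print(refined.shape)  # torch.Size([1, 3, 64, 64])
```

Because each stage consumes only RGB data sampled from the previous stage's rendering, the same wrapper can in principle be placed around any existing NeRF backbone, which is the plug-and-play property the abstract claims.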

Published

2024-03-24

How to Cite

Chen, L., Wang, G., Yuan, L., Wang, K., Deng, K., & Torr, P. H. (2024). NeRF-VPT: Learning Novel View Representations with Neural Radiance Fields via View Prompt Tuning. Proceedings of the AAAI Conference on Artificial Intelligence, 38(2), 1156-1164. https://doi.org/10.1609/aaai.v38i2.27877

Issue

Vol. 38 No. 2 (2024)
Section

AAAI Technical Track on Computer Vision I