Fast Multi-view Consistent 3D Editing with Video Priors
DOI:
https://doi.org/10.1609/aaai.v40i4.37286
Abstract
Text-driven 3D editing enables user-friendly editing of 3D objects or scenes through text instructions. Lacking multi-view consistency priors, existing methods typically resort to employing 2D generation or editing models to process each view individually, followed by iterative 2D-3D-2D updating. These methods are not only time-consuming but also prone to over-smoothed results, since the iterative process averages the divergent editing signals gathered from different views. In this paper, we propose ViP3DE, an early and pioneering work on generative Video Prior based 3D Editing, which repurposes the temporal consistency priors of pre-trained video generation models to achieve consistent 3D editing within a single forward pass. Our key insight is to condition the video generation model on a single edited view so that it generates other consistently edited views for direct 3D updating, thereby bypassing the iterative editing paradigm. First, since 3D updating requires edited views paired with specific camera poses, we propose motion-preserved noise blending, which enables the video model to generate edited views at predefined camera poses. In addition, we introduce geometrically aware denoising, which further enhances multi-view consistency by integrating 3D geometric priors into the video model. Extensive experiments demonstrate that ViP3DE achieves high-quality 3D editing results within a single forward pass, significantly outperforming existing methods in both editing quality and editing time.
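The single-forward-pass pipeline described above can be sketched at a high level: edit one view with a 2D editor, condition a video model on that edited frame to produce consistent edited views at predefined camera poses, then run one 3D update. The sketch below is a toy illustration of this control flow only; all function names are hypothetical placeholders standing in for the components named in the abstract, not the authors' actual API.

```python
import numpy as np

def edit_single_view(view, instruction):
    # Placeholder for any 2D text-driven editor applied to one view.
    return view + 0.1  # dummy "edit" for illustration

def video_model_generate(cond_frame, poses):
    # Placeholder for a pre-trained video model conditioned on a single
    # edited frame, emitting one edited view per predefined camera pose
    # (where motion-preserved noise blending would steer generation).
    return [cond_frame + 0.01 * i for i, _ in enumerate(poses)]

def update_3d(frames, poses):
    # Placeholder for a single 3D updating pass from posed edited views;
    # a toy aggregate stands in for actual scene optimization.
    return np.mean(frames, axis=0)

views = [np.zeros((4, 4)) for _ in range(5)]   # dummy rendered views
poses = list(range(5))                          # dummy camera poses
edited0 = edit_single_view(views[0], "make it golden")
edited_views = video_model_generate(edited0, poses)
scene = update_3d(edited_views, poses)          # one pass, no 2D-3D-2D loop
```

The point of the sketch is structural: the 3D update consumes all edited views in one pass, rather than averaging editing signals across repeated per-view iterations.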
Published
2026-03-14
How to Cite
Chen, L., Li, R., Zhang, G., Wang, P., & Zhang, L. (2026). Fast Multi-view Consistent 3D Editing with Video Priors. Proceedings of the AAAI Conference on Artificial Intelligence, 40(4), 2948-2956. https://doi.org/10.1609/aaai.v40i4.37286
Section
AAAI Technical Track on Computer Vision I