Diverse and Stable 2D Diffusion Guided Text to 3D Generation with Noise Recalibration

Authors

  • Xiaofeng Yang, Nanyang Technological University, Singapore
  • Fayao Liu, Institute for Infocomm Research, A*STAR, Singapore
  • Yi Xu, OPPO US Research Center, USA
  • Hanjing Su, Tencent, China
  • Qingyao Wu, South China University of Technology, China
  • Guosheng Lin, Nanyang Technological University, Singapore

DOI:

https://doi.org/10.1609/aaai.v38i7.28476

Keywords:

CV: 3D Computer Vision, CV: Language and Vision

Abstract

In recent years, following the success of text-guided image generation, text-guided 3D generation has gained increasing attention among researchers. DreamFusion is a notable approach that improves generation quality by leveraging 2D text-guided diffusion models and introducing the Score Distillation Sampling (SDS) loss, a technique for distilling 2D diffusion model knowledge into the training of 3D models. However, the SDS loss has two major limitations that hinder its effectiveness. First, given a text prompt, the SDS loss struggles to produce diverse content. Second, during training, the SDS loss may cause the generated content to overfit and collapse, limiting the model's ability to learn intricate texture details. To overcome these challenges, we propose a novel Noise Recalibration algorithm. By incorporating this technique, we can generate 3D content with significantly greater diversity and stunning details. Our approach offers a promising solution to the limitations of the SDS loss.
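
For orientation, the sketch below shows the standard SDS update from DreamFusion that the abstract refers to, not the Noise Recalibration algorithm proposed in this paper. It is a minimal PyTorch illustration: `DummyUNet`, the timestep range, and the weighting w(t) = 1 - alpha_bar_t are placeholder assumptions standing in for a frozen text-conditioned 2D diffusion model and its schedule.

```python
# Minimal sketch of the standard Score Distillation Sampling (SDS) loss from
# DreamFusion, NOT this paper's Noise Recalibration algorithm. DummyUNet and
# the hyper-parameters below are illustrative placeholders.
import torch

class DummyUNet(torch.nn.Module):
    """Placeholder for a frozen text-conditioned noise predictor eps_phi(x_t, t, y)."""
    def __init__(self, channels=4):
        super().__init__()
        self.net = torch.nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x_t, t, text_embed):
        return self.net(x_t)  # a real model would also condition on t and text_embed

def sds_loss(rendered, text_embed, unet, alphas_cumprod):
    """One SDS step: noise the rendered view, let the frozen 2D prior predict
    that noise, and push the residual (eps_pred - eps) back into the 3D parameters."""
    b = rendered.shape[0]
    t = torch.randint(20, len(alphas_cumprod), (b,))        # random diffusion timestep
    a_t = alphas_cumprod[t].view(b, 1, 1, 1)
    eps = torch.randn_like(rendered)
    x_t = a_t.sqrt() * rendered + (1 - a_t).sqrt() * eps    # forward process q(x_t | x_0)
    with torch.no_grad():
        eps_pred = unet(x_t, t, text_embed)                  # frozen 2D diffusion prior
    w_t = 1 - a_t                                            # a common choice of weighting w(t)
    grad = w_t * (eps_pred - eps)                            # SDS gradient w.r.t. the rendering
    # Surrogate loss whose gradient w.r.t. `rendered` equals `grad`,
    # skipping backpropagation through the U-Net.
    return (grad.detach() * rendered).sum()

# Usage: in practice `rendered` comes from a differentiable renderer of the 3D model.
unet = DummyUNet()
alphas_cumprod = torch.linspace(0.999, 0.01, 1000)
rendered = torch.rand(1, 4, 64, 64, requires_grad=True)
loss = sds_loss(rendered, text_embed=None, unet=unet, alphas_cumprod=alphas_cumprod)
loss.backward()
```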

Published

2024-03-24

How to Cite

Yang, X., Liu, F., Xu, Y., Su, H., Wu, Q., & Lin, G. (2024). Diverse and Stable 2D Diffusion Guided Text to 3D Generation with Noise Recalibration. Proceedings of the AAAI Conference on Artificial Intelligence, 38(7), 6549-6557. https://doi.org/10.1609/aaai.v38i7.28476

Issue

Vol. 38 No. 7 (2024)

Section

AAAI Technical Track on Computer Vision VI