MagicMan: Generative Novel View Synthesis of Humans with 3D-Aware Diffusion and Iterative Refinement

Xu He; Zhiyong Wu; Xiaoyu Li; Di Kang; Chaopeng Zhang; Jiangnan Ye; Liyang Chen; Xiangjun Gao; Han Zhang; Haolin Zhuang

doi:10.1609/aaai.v39i3.32356

Authors

Xu He (Shenzhen International Graduate School, Tsinghua University)
Zhiyong Wu (Shenzhen International Graduate School, Tsinghua University; The Chinese University of Hong Kong)
Xiaoyu Li (Tencent)
Di Kang (Tencent)
Chaopeng Zhang (Tencent)
Jiangnan Ye (Shenzhen International Graduate School, Tsinghua University)
Liyang Chen (Shenzhen International Graduate School, Tsinghua University)
Xiangjun Gao (The Hong Kong University of Science and Technology)
Han Zhang (Stanford University)
Haolin Zhuang (Shenzhen International Graduate School, Tsinghua University)

DOI:

https://doi.org/10.1609/aaai.v39i3.32356

Abstract

Existing works in single-image human reconstruction suffer from weak generalizability due to insufficient training data, or from 3D inconsistencies due to a lack of comprehensive multi-view knowledge. In this paper, we introduce MagicMan, a human-specific multi-view diffusion model designed to generate high-quality novel views from a single reference image. At its core, we leverage a pre-trained 2D diffusion model as the generative prior for generalizability, with the parametric SMPL-X model as the 3D body prior to promote 3D awareness. To maintain consistency while generating denser views for improved 3D human reconstruction, we introduce hybrid multi-view attention to facilitate efficient and thorough information interchange across views. In addition, we present a geometry-aware dual branch that performs concurrent generation in both RGB and normal domains, further enhancing consistency via geometry cues. Finally, to address ill-shaped issues arising from inaccurate SMPL-X estimation, we propose a novel iterative refinement strategy, which progressively optimizes SMPL-X accuracy while enhancing the quality and consistency of the generated multi-view images. Extensive experimental results demonstrate that our method significantly outperforms existing approaches in both novel view synthesis and subsequent 3D human reconstruction tasks.
