3DHumanEdit: Multi-modal Body Part-aware Conditioning Information Integration for 3D Human Manipulation

Authors

  • FeiFan Xu South China University of Technology
  • Tianyi Chen City University of Hong Kong
  • Fan Yang Nanyang Technological University
  • Yunfei Zhang South China University of Technology
  • Si Wu South China University of Technology

DOI:

https://doi.org/10.1609/aaai.v39i8.32955

Abstract

The rapid advancement of 3D Generative Adversarial Networks (GANs) has significantly enhanced the diversity and quality of generated 3D images. Despite these breakthroughs, the manipulation capabilities of 3D GANs remain largely underexplored, presenting substantial challenges for practical applications where user interaction and modification are essential. Current manipulation methods often lack the precision needed for fine-grained attribute editing and struggle to maintain multi-view consistency during the editing process. To address these limitations, we propose 3DHumanEdit, a novel approach for 3D human body part-aware manipulation. 3DHumanEdit leverages multi-modal feature fusion and body part-aware feature alignment to achieve precise manipulation of individual body parts based on detailed text inputs and segmentation images. By exploiting a 3D prior for accurate editing and enforcing correspondence in latent space, 3DHumanEdit ensures coherence across multiple views. Experiments demonstrate that 3DHumanEdit outperforms existing methods in both editing fidelity and multi-view consistency, offering a robust solution for fine-grained 3D manipulation.
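The abstract's notion of body part-aware conditioning, where text features guide editing only within the body regions given by a segmentation image, can be illustrated with a toy sketch. All names, shapes, and the fusion rule below are assumptions for illustration, not the authors' implementation:

```python
import numpy as np

def part_aware_fusion(seg_map, part_text_feats):
    """Broadcast each body part's text embedding onto its segmentation
    region, producing a spatial conditioning map of shape (H, W, D).

    seg_map: (H, W) integer array of body-part labels.
    part_text_feats: dict mapping part label -> (D,) text embedding.
    Pixels whose label has no embedding stay zero (unedited regions).
    """
    h, w = seg_map.shape
    d = len(next(iter(part_text_feats.values())))
    cond = np.zeros((h, w, d), dtype=np.float32)
    for label, feat in part_text_feats.items():
        # Assign this part's text embedding to every pixel of its region.
        cond[seg_map == label] = feat
    return cond

# Hypothetical usage: label 1 = "upper body", label 2 = "legs".
seg = np.array([[1, 1], [2, 2]])
feats = {1: np.array([1.0, 0.0]), 2: np.array([0.0, 2.0])}
cond = part_aware_fusion(seg, feats)  # shape (2, 2, 2)
```

In a full pipeline, such a spatially localized conditioning map would then be fused with image features so that a text edit modifies only its target body part, which is the locality property the abstract emphasizes.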

Published

2025-04-11

How to Cite

Xu, F., Chen, T., Yang, F., Zhang, Y., & Wu, S. (2025). 3DHumanEdit: Multi-modal Body Part-aware Conditioning Information Integration for 3D Human Manipulation. Proceedings of the AAAI Conference on Artificial Intelligence, 39(8), 8833–8841. https://doi.org/10.1609/aaai.v39i8.32955

Section

AAAI Technical Track on Computer Vision VII