Shape-Pose Ambiguity in Learning 3D Reconstruction from Images


  • Yunjie Wu Nanjing University
  • Zhengxing Sun Nanjing University
  • Youcheng Song Nanjing University
  • Yunhan Sun Nanjing University
  • YiJie Zhong Nanjing University
  • Jinlong Shi Jiangsu University of Science and Technology


3D Computer Vision


Learning single-image 3D reconstruction with only 2D image supervision is a promising research topic. The main challenge in image-supervised 3D reconstruction is the shape-pose ambiguity: a 2D supervision signal can be explained by an erroneous 3D shape viewed from an erroneous pose. This ambiguity introduces high uncertainty and misleads the learning process. Existing works rely on multi-view images or pose-aware annotations to resolve the ambiguity. In this paper, we propose to resolve the ambiguity without extra pose-aware labels or annotations. Our training data consists of single-view images from the same object category. To overcome the shape-pose ambiguity, we introduce a pose-independent GAN to learn the category-specific shape manifold from the image collections. With the learned shape space, we resolve the shape-pose ambiguity in the original images by training a pseudo pose regressor. Finally, we learn a reconstruction network with both the common re-projection loss and a pose-independent discrimination loss, making the results plausible from all views. Through experiments on synthetic and real image datasets, we demonstrate that our method performs comparably to existing methods while not requiring any extra pose-aware annotations, making it more applicable and adaptable.
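The shape-pose ambiguity described in the abstract can be illustrated with a toy example. The sketch below is not the authors' method; it assumes simple boxes, an orthographic camera looking along the z-axis, and hypothetical helper names (`box_corners`, `rotate_y`, `silhouette_extent`) chosen purely for illustration. It shows two different boxes, seen from two different poses, producing the same 2D silhouette, so a re-projection loss on that single view alone could not tell them apart.

```python
import math

def box_corners(ex, ey, ez):
    """Return the 8 corners of an axis-aligned box centred at the origin."""
    return [(sx * ex / 2, sy * ey / 2, sz * ez / 2)
            for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)]

def rotate_y(points, degrees):
    """Rotate points about the y-axis (a stand-in for the object pose)."""
    t = math.radians(degrees)
    c, s = math.cos(t), math.sin(t)
    return [(c * x + s * z, y, -s * x + c * z) for x, y, z in points]

def silhouette_extent(points):
    """Orthographic projection along z: (width, height) of the 2D bounding box."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (round(max(xs) - min(xs), 6), round(max(ys) - min(ys), 6))

# "True" explanation: a 2 x 1 x 3 box seen at 0 degrees.
true_view = silhouette_extent(rotate_y(box_corners(2, 1, 3), 0))

# "Erroneous" explanation: a 5 x 1 x 2 box seen at 90 degrees.
wrong_view = silhouette_extent(rotate_y(box_corners(5, 1, 2), 90))

# A second view (both poses turned by a further 90 degrees) separates the two.
true_second = silhouette_extent(rotate_y(box_corners(2, 1, 3), 90))
wrong_second = silhouette_extent(rotate_y(box_corners(5, 1, 2), 180))

print(true_view == wrong_view)      # the single view is ambiguous
print(true_second == wrong_second)  # an extra view breaks the tie
```

In this toy setting the second view removes the ambiguity, which is consistent with the abstract's remark that prior work relies on multi-view images or pose-aware annotations.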




How to Cite

Wu, Y., Sun, Z., Song, Y., Sun, Y., Zhong, Y., & Shi, J. (2021). Shape-Pose Ambiguity in Learning 3D Reconstruction from Images. Proceedings of the AAAI Conference on Artificial Intelligence, 35(4), 2978-2985.



AAAI Technical Track on Computer Vision III