Structure-aware Person Image Generation with Pose Decomposition and Semantic Correlation


  • Jilin Tang NetEase Fuxi AI Lab
  • Yi Yuan NetEase Fuxi AI Lab
  • Tianjia Shao Zhejiang University
  • Yong Liu Zhejiang University
  • Mengmeng Wang Zhejiang University
  • Kun Zhou Zhejiang University



Computational Photography, Image & Video Synthesis


In this paper we tackle the problem of pose-guided person image generation, which aims to transfer a person image from a source pose to a novel target pose while maintaining the source appearance. Given the inefficiency of standard CNNs in handling large spatial transformations, we propose a structure-aware, flow-based method for high-quality person image generation. Specifically, instead of learning the complex overall pose changes of the human body, we decompose the body into different semantic parts (e.g., head, torso, and legs) and apply separate networks to predict the flow field for each part. Moreover, we carefully design the network modules to capture the local semantic correlations of features within each part and the global semantic correlations among the parts. Extensive experimental results show that our method generates high-quality results under large pose discrepancy and outperforms state-of-the-art methods in both qualitative and quantitative comparisons.
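
To make the idea of part-wise flow warping concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes each part's flow field is already available (in the paper the flows are predicted by networks) and that each part has a soft mask defined in the target pose. The names `warp_with_flow` and `compose_parts`, the `(dx, dy)` flow convention, and the mask-based blending are all assumptions made for illustration.

```python
import numpy as np


def warp_with_flow(source, flow):
    """Backward-warp `source` (H, W, C) with `flow` (H, W, 2).

    Assumed convention: flow[y, x] = (dx, dy), so output pixel (x, y)
    samples the source at (x + dx, y + dy). Sampling is bilinear and
    coordinates are clamped to the image border.
    """
    h, w = source.shape[:2]
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    sx = np.clip(xs + flow[..., 0], 0, w - 1)
    sy = np.clip(ys + flow[..., 1], 0, h - 1)

    # Integer corners surrounding each sampling location.
    x0 = np.floor(sx).astype(int)
    y0 = np.floor(sy).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)

    # Fractional offsets, broadcast over the channel axis.
    wx = (sx - x0)[..., None]
    wy = (sy - y0)[..., None]

    top = source[y0, x0] * (1 - wx) + source[y0, x1] * wx
    bottom = source[y1, x0] * (1 - wx) + source[y1, x1] * wx
    return top * (1 - wy) + bottom * wy


def compose_parts(source, part_flows, part_masks):
    """Warp the source once per part and blend with target-space masks.

    `part_flows` holds one (H, W, 2) flow per semantic part and
    `part_masks` holds the matching (H, W) masks; both are hypothetical
    inputs standing in for network predictions.
    """
    out = np.zeros_like(source, dtype=float)
    for flow, mask in zip(part_flows, part_masks):
        out += mask[..., None] * warp_with_flow(source, flow)
    return out


# Toy usage: the top half of the image stays in place while the bottom
# half samples from one pixel to the right.
source = np.arange(16, dtype=float).reshape(4, 4, 1)
still = np.zeros((4, 4, 2))
shift = np.zeros((4, 4, 2))
shift[..., 0] = 1.0
top_mask = np.zeros((4, 4))
top_mask[:2] = 1.0
bottom_mask = 1.0 - top_mask
result = compose_parts(source, [still, shift], [top_mask, bottom_mask])
```

Warping each part with its own flow lets each region follow a simpler motion than a single flow for the whole body would need to describe, which is the intuition behind the decomposition; the blending step here is only a stand-in for the learned fusion described in the abstract.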




How to Cite

Tang, J., Yuan, Y., Shao, T., Liu, Y., Wang, M., & Zhou, K. (2021). Structure-aware Person Image Generation with Pose Decomposition and Semantic Correlation. Proceedings of the AAAI Conference on Artificial Intelligence, 35(3), 2656-2664.



AAAI Technical Track on Computer Vision II