Joint Human Pose Estimation and Instance Segmentation with PosePlusSeg


  • Niaz Ahmad, Hanyang University
  • Jawad Khan, Hanyang University
  • Jeremy Yuhyun Kim, Hanyang University
  • Youngmoon Lee, Hanyang University



Computer Vision (CV), Machine Learning (ML), Intelligent Robotics (ROB), Humans And AI (HAI)


Despite advances in multi-person pose estimation, state-of-the-art techniques deliver only the human pose structure. They do not leverage the keypoints of the human pose to recover whole-body shape information for human instance segmentation. This paper presents PosePlusSeg, a joint model designed for both human pose estimation and instance segmentation. For pose estimation, PosePlusSeg first takes a bottom-up approach to detect the soft and hard keypoints of individuals by producing a strong keypoint heat map, then improves the keypoint detection confidence score by producing a body heat map. For instance segmentation, PosePlusSeg generates a mask offset in which each keypoint is defined as a centroid for the pixels in the embedding space, enabling instance-level segmentation for the human class. Finally, we propose a new pose and instance segmentation algorithm that enables PosePlusSeg to determine the joint structure of the human pose and instance segmentation. Experiments on the challenging COCO dataset demonstrate that PosePlusSeg copes better with difficult scenarios such as occlusions, entangled limbs, and overlapping people. PosePlusSeg outperforms state-of-the-art detection-based approaches, achieving 0.728 mAP for human pose estimation and 0.445 mAP for instance segmentation. Code has been made available at:
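
The mask-offset idea in the abstract, where a keypoint acts as a centroid for pixels in the embedding space, can be illustrated with a small sketch. This is not the authors' released code or their algorithm. It assumes each foreground pixel predicts a 2D offset toward its person's keypoint and is then assigned to the nearest keypoint. The function name `assign_pixels_to_instances`, the `max_dist` threshold, and the nearest-centroid rule are all hypothetical choices made for illustration.

```python
import numpy as np

def assign_pixels_to_instances(offsets, fg_mask, keypoints, max_dist=8.0):
    """Group foreground pixels into person instances (illustrative only).

    offsets   : (H, W, 2) array of predicted (dy, dx) from each pixel
                toward the keypoint treated as its instance centroid.
    fg_mask   : (H, W) boolean array, True for pixels of the human class.
    keypoints : (K, 2) array of (y, x) centroid positions, one per person.
    max_dist  : hypothetical threshold; votes farther than this from every
                centroid are left unassigned.

    Returns an (H, W) integer map: -1 for background or unassigned pixels,
    otherwise the index of the assigned instance.
    """
    h, w = fg_mask.shape
    ys, xs = np.mgrid[0:h, 0:w]

    # Each pixel "votes" for a location: its own position plus its offset.
    voted = np.stack([ys + offsets[..., 0], xs + offsets[..., 1]], axis=-1)

    # Distance from every vote to every keypoint centroid: shape (H, W, K).
    diffs = voted[:, :, None, :] - keypoints[None, None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)

    nearest = dists.argmin(axis=-1)
    close_enough = dists.min(axis=-1) <= max_dist

    # Assumption: nearest-centroid assignment, restricted to foreground.
    return np.where(fg_mask & close_enough, nearest, -1)
```

With ideal offsets, every foreground pixel lands exactly on its own keypoint and the label map separates people even when their pixels are adjacent. In practice the offsets would come from a network head, and the assignment rule would follow whatever the paper specifies.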




How to Cite

Ahmad, N., Khan, J., Kim, J. Y., & Lee, Y. (2022). Joint Human Pose Estimation and Instance Segmentation with PosePlusSeg. Proceedings of the AAAI Conference on Artificial Intelligence, 36(1), 69-76.



AAAI Technical Track on Computer Vision I