Geometry-Driven Self-Supervised Method for 3D Human Pose Estimation

Yang Li; Kan Li; Shuai Jiang; Ziyue Zhang; Congzhentao Huang; Richard Yi Da Xu

doi:10.1609/aaai.v34i07.6808

Authors

Yang Li, Beijing Institute of Technology
Kan Li, Beijing Institute of Technology
Shuai Jiang, University of Technology Sydney
Ziyue Zhang, University of Technology Sydney
Congzhentao Huang, University of Technology Sydney
Richard Yi Da Xu, University of Technology Sydney

DOI: https://doi.org/10.1609/aaai.v34i07.6808

Abstract

Neural-network-based approaches to 3D human pose estimation from monocular images have attracted growing interest. However, annotating 3D poses is labor-intensive and expensive. In this paper, we propose a novel self-supervised approach that avoids the need for manual annotations. Unlike existing weakly/self-supervised methods, which require extra unpaired 3D ground-truth data to alleviate depth ambiguity, our method trains the network using only geometric knowledge, without any additional 3D pose annotations. The proposed method follows a two-stage pipeline: 2D pose estimation followed by 2D-to-3D pose lifting. We design a transform re-projection loss, which effectively exploits multi-view consistency to train the 2D-to-3D lifting network. In addition, we use the confidences of the 2D joints to integrate losses from different views, which alleviates the influence of noise caused by self-occlusion. Finally, we design a two-branch training architecture that helps preserve the scale information of re-projected 2D poses during training, resulting in accurate 3D pose predictions. We demonstrate the effectiveness of our method on two popular 3D human pose datasets, Human3.6M and MPI-INF-3DHP. The results show that our method significantly outperforms recent weakly/self-supervised approaches.
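The abstract does not give the exact form of the transform re-projection loss, so the following is only a minimal sketch of the general idea it describes: a 3D pose lifted from one view is moved into a second camera's frame, projected to 2D, and compared with the 2D pose detected in that second view, with each joint's error weighted by its detection confidence. Everything here is an assumption made for illustration: the pinhole projection, the relative camera motion being a known rotation about the y axis plus a translation, the Euclidean per-joint error, the normalization by the sum of confidences, and all function and parameter names.

```python
import math


def rotate_y(point, angle):
    """Rotate a 3D point about the y axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)


def to_other_view(point, angle, translation):
    """Map a point from the first camera frame to the second (assumed known motion)."""
    x, y, z = rotate_y(point, angle)
    tx, ty, tz = translation
    return (x + tx, y + ty, z + tz)


def project(point, focal=1.0):
    """Pinhole projection of a 3D point with positive depth onto the image plane."""
    x, y, z = point
    return (focal * x / z, focal * y / z)


def transform_reprojection_loss(pose3d_view1, pose2d_view2, confidences,
                                angle, translation):
    """Confidence-weighted mean 2D error after transforming and re-projecting.

    pose3d_view1: 3D joints predicted from view 1 (hypothetical lifter output).
    pose2d_view2: 2D joints detected in view 2.
    confidences:  per-joint detection confidences in view 2, in [0, 1].
    """
    total = 0.0
    weight_sum = 0.0
    for joint3d, joint2d, conf in zip(pose3d_view1, pose2d_view2, confidences):
        u, v = project(to_other_view(joint3d, angle, translation))
        total += conf * math.hypot(u - joint2d[0], v - joint2d[1])
        weight_sum += conf
    return total / weight_sum if weight_sum > 0 else 0.0


# Toy example: three joints, second camera rotated 30 degrees and shifted in depth.
pose3d = [(0.0, 0.0, 4.0), (0.5, -1.0, 4.2), (-0.5, 1.0, 3.8)]
angle = math.pi / 6
translation = (0.0, 0.0, 1.0)

# "Detections" in view 2 that agree exactly with the 3D pose.
target2d = [project(to_other_view(p, angle, translation)) for p in pose3d]

# Simulate an occluded, badly detected joint in view 2.
noisy2d = list(target2d)
noisy2d[1] = (noisy2d[1][0] + 0.3, noisy2d[1][1])

loss_consistent = transform_reprojection_loss(
    pose3d, target2d, [1.0, 1.0, 1.0], angle, translation)
loss_trusted = transform_reprojection_loss(
    pose3d, noisy2d, [1.0, 1.0, 1.0], angle, translation)
loss_downweighted = transform_reprojection_loss(
    pose3d, noisy2d, [1.0, 0.0, 1.0], angle, translation)
```

In this toy setup the loss vanishes when the two views are geometrically consistent, and giving the corrupted joint a confidence of zero removes its contribution, which is the kind of robustness to self-occlusion noise the abstract attributes to confidence weighting.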
