RSGNet: Relation based Skeleton Graph Network for Crowded Scenes Pose Estimation

Yan Dai; Xuanhan Wang; Lianli Gao; Jingkuan Song; Heng Tao Shen

doi:10.1609/aaai.v35i2.16206

Authors

Yan Dai University of Electronic Science and Technology of China
Xuanhan Wang University of Electronic Science and Technology of China
Lianli Gao University of Electronic Science and Technology of China
Jingkuan Song University of Electronic Science and Technology of China, Key Laboratory of Artificial Intelligence, Ministry of Education
Heng Tao Shen University of Electronic Science and Technology of China

DOI:

https://doi.org/10.1609/aaai.v35i2.16206

Keywords:

Biometrics, Face, Gesture & Pose, (Deep) Neural Network Algorithms

Abstract

Despite recent progress in multi-person pose estimation, existing solutions still struggle in "crowded scenes", where RGB images capture complex real-world scenes with heavily overlapping people, severe occlusions, and diverse postures. In this work, we focus on two main problems: 1) how to design an effective pipeline for crowded-scene pose estimation; and 2) how to equip this pipeline with relation modeling so that it can resolve interference. To tackle these problems, we propose a new pipeline named Relation based Skeleton Graph Network (RSGNet). Unlike existing works, which directly predict the joints of the target by labeling interfering joints as false positives, we first encourage all joints to be predicted. A Target-aware Relation Parser (TRP) then models the relations among all predicted joints, resulting in a target-aware encoding. This pipeline largely relieves the confusion the joint estimation model faces when it sees identical joints with entirely different labels (e.g., the same hand appearing in two bounding boxes). Furthermore, we introduce a Skeleton Graph Machine (SGM) to model skeleton-based commonsense knowledge, aiming to estimate the target pose under the constraint of human body structure. This skeleton-based constraint helps address the challenges of crowded scenes from a reasoning perspective. Experiments on pose estimation benchmarks demonstrate that our method outperforms existing state-of-the-art methods.
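The labeling conflict described in the abstract can be illustrated with a small toy example. The sketch below is not the RSGNet implementation; the `Joint` and `Box` classes, the coordinates, and both labeling functions are hypothetical and exist only to show how one hand that falls inside two overlapping bounding boxes receives contradictory labels under target-only supervision, and consistent labels when all joints are treated as positives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Joint:
    name: str
    x: float
    y: float
    person_id: int  # which person this joint belongs to


@dataclass(frozen=True)
class Box:
    person_id: int  # the target person this box was detected for
    x0: float
    y0: float
    x1: float
    y1: float

    def contains(self, joint: Joint) -> bool:
        return self.x0 <= joint.x <= self.x1 and self.y0 <= joint.y <= self.y1


def target_only_labels(box: Box, joints: list) -> dict:
    """Label joints of other people inside the box as negatives (0)."""
    return {
        (j.name, j.person_id): int(j.person_id == box.person_id)
        for j in joints
        if box.contains(j)
    }


def all_joints_labels(box: Box, joints: list) -> dict:
    """Label every joint visible inside the box as a positive (1)."""
    return {(j.name, j.person_id): 1 for j in joints if box.contains(j)}


# Toy scene: person 0's left hand lies in the overlap of both boxes.
joints = [
    Joint("left_hand", 5.0, 5.0, person_id=0),
    Joint("head", 2.0, 1.0, person_id=0),
    Joint("head", 8.0, 1.0, person_id=1),
]
boxes = [Box(0, 0.0, 0.0, 6.0, 10.0), Box(1, 4.0, 0.0, 10.0, 10.0)]

hand = ("left_hand", 0)
conflicting = [target_only_labels(b, joints)[hand] for b in boxes]
consistent = [all_joints_labels(b, joints)[hand] for b in boxes]
```

In this toy scene `conflicting` holds two different labels for the same hand, while `consistent` holds the same label twice. Deciding which of the predicted joints belong to the target is then left to a later stage, which is the role the abstract assigns to the TRP and SGM.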
