STEP: Spatial Temporal Graph Convolutional Networks for Emotion Perception from Gaits

Uttaran Bhattacharya; Trisha Mittal; Rohan Chandra; Tanmay Randhavane; Aniket Bera; Dinesh Manocha

doi:10.1609/aaai.v34i02.5490

Authors

Uttaran Bhattacharya University of Maryland, College Park
Trisha Mittal University of Maryland, College Park
Rohan Chandra University of Maryland, College Park
Tanmay Randhavane University of North Carolina, Chapel Hill
Aniket Bera University of Maryland, College Park
Dinesh Manocha University of Maryland, College Park

Abstract

We present STEP, a novel classifier network that classifies perceived human emotion from gaits, based on a Spatial Temporal Graph Convolutional Network (ST-GCN) architecture. Given an RGB video of an individual walking, our formulation implicitly exploits the gait features to classify the perceived emotion of the human into one of four emotions: happy, sad, angry, or neutral. We train STEP on annotated real-world gait videos, augmented with annotated synthetic gaits generated using a novel generative network called STEP-Gen, built on an ST-GCN based Conditional Variational Autoencoder (CVAE). We incorporate a novel push-pull regularization loss in the CVAE formulation of STEP-Gen to generate realistic gaits and improve the classification accuracy of STEP. We also release a novel dataset (E-Gait), which consists of 4,227 human gaits annotated with perceived emotions along with thousands of synthetic gaits. In practice, STEP can learn the affective features and exhibits a classification accuracy of 88% on E-Gait, which is 14–30% more accurate than prior methods.
