Unsupervised Representation Learning With Long-Term Dynamics for Skeleton Based Action Recognition
DOI: https://doi.org/10.1609/aaai.v32i1.11853
Keywords: Unsupervised Learning, Action Recognition, RNN, GAN
Abstract
In recent years, skeleton-based action recognition has become an increasingly attractive alternative to existing video-based approaches, benefiting from its robust and comprehensive 3D information. In this paper, we explore, for the first time, an unsupervised representation learning approach that captures the long-term global motion dynamics in skeleton sequences. We design a conditional skeleton inpainting architecture for learning a fixed-dimensional representation, guided by additional adversarial training strategies. We quantitatively evaluate the effectiveness of our learning approach on three well-established action recognition datasets. Experimental results show that our learned representation is discriminative for classifying actions and can substantially reduce sequence inpainting errors.
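To make the inpainting setup concrete, the following is a minimal sketch of the task itself, not the architecture proposed in the paper. It masks a contiguous block of frames in a skeleton sequence and measures the reconstruction error on the masked frames only. The array layout (frames, joints, 3D coordinates), the joint count of 25, the zero-fill masking, and the linear-interpolation baseline are all illustrative assumptions; the function names are hypothetical. A learned model would take the place of the baseline.

```python
import numpy as np

def mask_frames(seq, start, length):
    """Zero out a contiguous block of frames (assumed masking scheme).

    seq has shape (frames, joints, 3). Returns the corrupted copy and a
    boolean mask over frames marking which ones were removed.
    """
    mask = np.zeros(seq.shape[0], dtype=bool)
    mask[start:start + length] = True
    corrupted = seq.copy()
    corrupted[mask] = 0.0
    return corrupted, mask

def inpainting_error(original, reconstructed, mask):
    """Mean squared error computed over the masked frames only."""
    diff = original[mask] - reconstructed[mask]
    return float(np.mean(diff ** 2))

def linear_interp_baseline(corrupted, mask):
    """Hypothetical non-learned baseline: fill the gap by linearly
    interpolating every joint coordinate between the known frames."""
    num_frames = corrupted.shape[0]
    flat = corrupted.reshape(num_frames, -1).copy()
    t = np.arange(num_frames)
    known = ~mask
    for d in range(flat.shape[1]):
        flat[mask, d] = np.interp(t[mask], t[known], flat[known, d])
    return flat.reshape(corrupted.shape)

# Synthetic stand-in for a skeleton sequence: 40 frames, 25 joints, 3D.
rng = np.random.default_rng(0)
seq = np.cumsum(rng.normal(size=(40, 25, 3)), axis=0)

corrupted, mask = mask_frames(seq, start=15, length=10)
filled = linear_interp_baseline(corrupted, mask)

print("error with zero-filled gap:", inpainting_error(seq, corrupted, mask))
print("error with interpolated gap:", inpainting_error(seq, filled, mask))
```

Restricting the error to the masked frames keeps the metric focused on what the model had to infer instead of what it could copy from the input.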