Spatio-Temporal Graph Convolution for Skeleton Based Action Recognition

Authors

  • Chaolong Li, Southeast University
  • Zhen Cui, Nanjing University of Science and Technology
  • Wenming Zheng, Southeast University
  • Chunyan Xu, Nanjing University of Science and Technology
  • Jian Yang, Nanjing University of Science and Technology

Abstract

Variations of human body skeletons can be viewed as dynamic graphs, a generic data representation for numerous real-world applications. In this paper, we propose a spatio-temporal graph convolution (STGC) approach that combines the strengths of local convolutional filtering with the sequence learning ability of autoregressive moving average models. To encode dynamic graphs, we construct multi-scale local graph convolution filters, consisting of matrices of local receptive fields and signal mappings, and apply them recursively to structured graph data in the temporal and spatial domains. The proposed model is generic and principled, as it can be generalized to other dynamic models. We theoretically prove the stability of STGC and provide an upper bound on the signal transformation to be learnt. Furthermore, the proposed recursive model can be stacked into a multi-layer architecture. To evaluate our model, we conduct extensive experiments on four benchmark skeleton-based action datasets, including the large-scale, challenging NTU RGB+D. The experimental results demonstrate the effectiveness of the proposed model and its improvement over the state of the art.
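
The abstract does not state the STGC equations, so the following is only an illustrative sketch of the general idea of applying multi-scale graph convolution filters recursively over a skeleton sequence. It is not the authors' formulation. Everything in it is an assumption made for illustration: the row-normalized adjacency with self-loops, the use of its powers as multi-scale receptive fields, the tanh nonlinearity, the simple hidden-state recurrence standing in for the autoregressive moving average formulation, and all function names. Plain Python lists are used so that the sketch has no dependencies.

```python
import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two equally shaped matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def row_normalize(adj):
    """Add self-loops to an adjacency matrix and scale each row to sum to 1."""
    n = len(adj)
    with_loops = [
        [adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)
    ]
    return [[v / sum(row) for v in row] for row in with_loops]


def multi_scale_graph_conv(x, a_hat, weights):
    """Compute sum_k (a_hat^k @ x @ weights[k]) for k = 0 .. len(weights) - 1.

    a_hat^k plays the role of a k-hop receptive field (assumption) and
    weights[k] the role of the signal mapping at that scale (assumption).
    """
    out = None
    propagated = x  # a_hat^0 @ x
    for w in weights:
        term = matmul(propagated, w)
        out = term if out is None else add(out, term)
        propagated = matmul(a_hat, propagated)  # move to the next scale
    return out


def recurrent_step(x_t, h_prev, a_hat, w_in, w_state):
    """One recursive update: filter the current frame and the previous state."""
    pre = add(
        multi_scale_graph_conv(x_t, a_hat, w_in),
        multi_scale_graph_conv(h_prev, a_hat, w_state),
    )
    return [[math.tanh(v) for v in row] for row in pre]


def encode_sequence(frames, a_hat, w_in, w_state, hidden_dim):
    """Run the recurrence over all frames and return the final joint states."""
    h = [[0.0] * hidden_dim for _ in a_hat]
    for x_t in frames:
        h = recurrent_step(x_t, h, a_hat, w_in, w_state)
    return h


# Toy usage: a 3-joint chain skeleton, 2 frames, 2 features per joint.
chain = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
a_hat = row_normalize(chain)
frame = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
final_state = encode_sequence(
    [frame, frame], a_hat, [identity, identity], [identity, identity], hidden_dim=2
)
```

In this toy setup, `final_state` holds one hidden vector per joint. A classifier over actions would still need a pooling step and an output layer, and a stacked variant would feed each layer's per-frame states to the next layer; both are omitted here.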

Published

2018-04-29

How to Cite

Li, C., Cui, Z., Zheng, W., Xu, C., & Yang, J. (2018). Spatio-Temporal Graph Convolution for Skeleton Based Action Recognition. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1). Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/11776