Dynamic Semantic-Based Spatial Graph Convolution Network for Skeleton-Based Human Action Recognition
DOI:
https://doi.org/10.1609/aaai.v38i6.28440
Keywords:
CV: Video Understanding & Activity Analysis, DMKM: Mining of Spatial, Temporal or Spatio-Temporal Data, HAI: Applications, ML: Deep Neural Architectures and Foundation Models, ML: Graph-based Machine Learning
Abstract
Graph convolutional networks (GCNs) have attracted great attention and achieved remarkable performance in skeleton-based action recognition. However, most previous works refine the skeleton topology without considering the types of different joints and edges, making them unable to represent semantic information. In this paper, we propose a dynamic semantic-based graph convolution network (DS-GCN) for skeleton-based human action recognition, in which joint and edge types are encoded into the skeleton topology implicitly. Specifically, two semantic modules are proposed: a joint type-aware adaptive topology and an edge type-aware adaptive topology. Combining the proposed semantic modules with temporal convolution yields a powerful framework, named DS-GCN, for skeleton-based action recognition. Extensive experiments on two datasets, NTU RGB+D and Kinetics-400, show that the proposed semantic modules are general enough to be plugged into various backbones to boost recognition accuracy. Meanwhile, DS-GCN notably outperforms state-of-the-art methods. The code is released at https://github.com/davelailai/DS-GCN
Downloads
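The abstract describes encoding joint types into an adaptive skeleton topology. As a minimal sketch of that idea (not the authors' implementation), one can assign each joint a coarse type and add a learned per-type-pair offset to the physical adjacency matrix; the function name, type grouping, and offset table below are illustrative assumptions:

```python
import numpy as np

def type_aware_adjacency(A_base, joint_types, type_offsets):
    """Sketch of a joint type-aware adaptive topology (hypothetical, not the paper's code).

    A_base:       (V, V) physical skeleton adjacency
    joint_types:  (V,) integer type id per joint (e.g. 0 = torso, 1 = limb)
    type_offsets: (T, T) learnable offset for every (type_i, type_j) pair
    """
    # Broadcast the (T, T) offset table to a (V, V) matrix by indexing
    # with each joint's type, then add it to the fixed skeleton graph.
    offsets = type_offsets[np.ix_(joint_types, joint_types)]
    return A_base + offsets

# Toy 4-joint chain 0-1-2-3 with two joint types.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
types = np.array([0, 0, 1, 1])
offs = np.array([[0.1, 0.2],
                 [0.2, 0.3]])  # stands in for learned parameters
A_dyn = type_aware_adjacency(A, types, offs)
```

In a full model these offsets would be learned end-to-end, so unconnected joints of related types (here, pairs with no physical bone) can still acquire nonzero edge weight.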
Published
2024-03-24
How to Cite
Xie, J., Meng, Y., Zhao, Y., Nguyen, A., Yang, X., & Zheng, Y. (2024). Dynamic Semantic-Based Spatial Graph Convolution Network for Skeleton-Based Human Action Recognition. Proceedings of the AAAI Conference on Artificial Intelligence, 38(6), 6225-6233. https://doi.org/10.1609/aaai.v38i6.28440
Issue
Section
AAAI Technical Track on Computer Vision V