Two-Stream Convolution Augmented Transformer for Human Activity Recognition

Bing Li; Wei Cui; Wei Wang; Le Zhang; Zhenghua Chen; Min Wu

doi:10.1609/aaai.v35i1.16103

Authors

Bing Li, University of New South Wales, Australia
Wei Cui, Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore
Wei Wang, Dongguan University of Technology, China; University of New South Wales, Australia
Le Zhang, Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore
Zhenghua Chen, Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore
Min Wu, Institute for Infocomm Research, Agency for Science, Technology and Research (A*STAR), Singapore

DOI:

https://doi.org/10.1609/aaai.v35i1.16103

Keywords:

Internet of Things, Sensor Networks & Smart Cities

Abstract

Recognition of human activities is an important task due to its far-reaching applications, such as healthcare systems, context-aware applications, and security monitoring. Recently, WiFi-based human activity recognition (HAR) has become widespread because it is non-invasive. Existing WiFi-based HAR methods regard WiFi signals as a temporal sequence of channel state information (CSI) and employ deep sequential models (e.g., RNN, LSTM) to automatically capture channel-over-time features. Although remarkably effective, these methods suffer from two major drawbacks. First, a single temporal point is too elementary a unit to represent meaningful CSI patterns. Second, time-over-channel features are also important and could serve as a natural form of data augmentation. To address these drawbacks, we propose a novel Two-stream Convolution Augmented Human Activity Transformer (THAT) model. Our model uses a two-stream structure to capture both time-over-channel and channel-over-time features, and a multi-scale convolution augmented transformer to capture range-based patterns. Extensive experiments on four real-world experimental datasets demonstrate that our model outperforms state-of-the-art models in terms of both effectiveness and efficiency.
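As a rough illustration of the two ideas named in the abstract, the sketch below builds two views of a toy CSI matrix (one read as a sequence over time steps, the other as a sequence over channels) and summarizes each view over windows of several widths. It is only a sketch under stated assumptions: the function names, the toy data, and the window widths are hypothetical, a plain sliding-window mean stands in for learned convolutions, and no transformer layer is included. It is not the authors' implementation.

```python
from typing import Dict, List, Sequence

Matrix = List[List[float]]


def transpose(x: Matrix) -> Matrix:
    """Swap the two axes of a rectangular matrix."""
    return [list(col) for col in zip(*x)]


def window_means(seq: Sequence[float], width: int) -> List[float]:
    """Mean of every run of `width` consecutive entries (stride 1, no padding)."""
    return [sum(seq[i:i + width]) / width for i in range(len(seq) - width + 1)]


def multi_scale_features(x: Matrix, widths: Sequence[int] = (1, 2, 3)) -> Dict[int, Matrix]:
    """Summarize ranges of consecutive steps (rows of x) at several widths.

    Assumption: a fixed mean filter replaces the learned multi-scale
    convolutions that the abstract refers to.
    """
    columns = transpose(x)
    return {w: transpose([window_means(c, w) for c in columns]) for w in widths}


# Hypothetical toy CSI: 4 time steps (rows) x 2 channels (columns).
csi = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]

# Stream 1: a sequence over time, each step a vector of channel values.
temporal_view = csi
# Stream 2: a sequence over channels, each step a vector of time samples.
channel_view = transpose(csi)

temporal_feats = multi_scale_features(temporal_view, widths=(1, 2, 3))
channel_feats = multi_scale_features(channel_view, widths=(1, 2))
```

With a width of 2, the temporal stream shrinks from four steps to three range-level steps, which shows how a range of points, rather than a single point, becomes the unit of representation.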
