Light but Sharp: SlimSTAD for Real-Time Action Detection from Sensor Data

Authors

  • Wei Cui, Institute for Infocomm Research (I2R), A*STAR
  • Lukai Fan, Shandong University of Science and Technology
  • Zhenghua Chen, University of Glasgow
  • Min Wu, Institute for Infocomm Research (I2R), A*STAR
  • Shili Xiang, Institute for Infocomm Research (I2R), A*STAR
  • Haixia Wang, Shandong University of Science and Technology
  • Bing Li, University of Electronic Science and Technology of China

DOI:

https://doi.org/10.1609/aaai.v40i1.36975

Abstract

Sensory Temporal Action Detection (STAD) aims to localize and classify human actions within long, untrimmed sequences captured by non-visual sensors such as WiFi or inertial measurement units (IMUs). Unlike video-based TAD, STAD poses unique challenges due to the low-dimensional, noisy, and heterogeneous nature of sensory data, as well as the real-time and resource constraints on edge devices. While recent STAD models have improved detection performance, their high computational cost hampers practical deployment. In this paper, we propose SlimSTAD, a simple yet effective framework that achieves both high accuracy and low latency for STAD. SlimSTAD features a novel Decoupled Channel Modeling (DCM) encoder, which preserves modality-specific temporal features and enables efficient inter-channel aggregation via lightweight graph attention. An anchor-free cascade predictor then refines action boundaries and class predictions in a two-stage design without dense proposals. Experiments on two real-world datasets demonstrate that SlimSTAD outperforms strong video-derived and sensory baselines by an average of 2.1 mAP, while significantly reducing GFLOPs, parameters, and latency, validating its effectiveness for real-world, edge-aware STAD deployment.
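
To make the "anchor-free, two-stage" idea concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes a common anchor-free formulation in which each time step predicts a confidence score plus distances to the action's start and end, and a second stage adds residual offsets to those distances. The function name `decode_segments`, the input layout, and the threshold are all hypothetical; the abstract does not specify these details.

```python
from typing import List, Tuple

# (start, end, score) in units of time steps; a hypothetical output format.
Segment = Tuple[float, float, float]


def decode_segments(
    scores: List[float],
    coarse: List[Tuple[float, float]],
    refine: List[Tuple[float, float]],
    threshold: float = 0.5,
) -> List[Segment]:
    """Turn per-time-step predictions into action segments.

    scores[t]  : confidence that time step t lies inside an action
    coarse[t]  : stage-1 distances (to_start, to_end) from time step t
    refine[t]  : stage-2 residual offsets added to the stage-1 distances
    All of these inputs are assumptions made for illustration.
    """
    segments: List[Segment] = []
    for t, (score, (d_s, d_e), (r_s, r_e)) in enumerate(zip(scores, coarse, refine)):
        if score < threshold:
            continue  # no dense proposals: only confident time steps emit a segment
        start = max(0.0, t - (d_s + r_s))  # clamp to the start of the sequence
        end = t + (d_e + r_e)
        if end > start:  # drop degenerate segments
            segments.append((start, end, score))
    return segments


example = decode_segments(
    scores=[0.1, 0.9, 0.2],
    coarse=[(0.0, 0.0), (1.0, 1.0), (0.0, 0.0)],
    refine=[(0.0, 0.0), (-0.25, 0.5), (0.0, 0.0)],
)
print(example)
```

In this toy input only the middle time step passes the threshold, so a single segment is produced; a real detector would also apply class prediction and some form of duplicate suppression, which this sketch omits.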

Published

2026-03-14

How to Cite

Cui, W., Fan, L., Chen, Z., Wu, M., Xiang, S., Wang, H., & Li, B. (2026). Light but Sharp: SlimSTAD for Real-Time Action Detection from Sensor Data. Proceedings of the AAAI Conference on Artificial Intelligence, 40(1), 156–165. https://doi.org/10.1609/aaai.v40i1.36975

Section

AAAI Technical Track on Application Domains I