Temporal Dynamics Enhancer for Directly Trained Spiking Object Detectors

Authors

  • Fan Luo, State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences (CASIA); School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
  • Zeyu Gao, State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences (CASIA); School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
  • Xinhao Luo, State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences (CASIA); School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China
  • Kai Zhao, Institute of Automation, Chinese Academy of Sciences (CASIA)
  • Yanfeng Lu, State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences (CASIA); School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing 100049, China

DOI:

https://doi.org/10.1609/aaai.v40i3.37178

Abstract

Spiking Neural Networks (SNNs), with their brain-inspired spatiotemporal dynamics and spike-driven computation, have emerged as promising energy-efficient alternatives to Artificial Neural Networks (ANNs). However, existing SNNs typically replicate inputs directly or aggregate them into frames at fixed intervals. Under such strategies, neurons receive nearly identical stimuli across time steps, which severely limits the model's expressive power, particularly in complex tasks such as object detection. In this work, we propose the Temporal Dynamics Enhancer (TDE) to strengthen SNNs' capacity for temporal information modeling. TDE consists of two modules: a Spiking Encoder (SE) that generates diverse input stimuli across time steps, and an Attention Gating Module (AGM) that guides the SE's generation based on inter-temporal dependencies. Moreover, to eliminate the high-energy multiplication operations introduced by the AGM, we propose Spike-Driven Attention (SDA), which reduces attention-related energy consumption. Extensive experiments demonstrate that TDE can be seamlessly integrated into existing SNN-based detectors and consistently outperforms state-of-the-art methods, achieving mAP@50-95 scores of 57.7% on the static PASCAL VOC dataset and 47.6% on the neuromorphic EvDET200K dataset. In terms of energy consumption, the SDA consumes only 0.240× the energy of conventional attention modules.
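
The limitation the abstract describes, that direct input replication gives a neuron the same stimulus at every time step, can be illustrated with a minimal sketch. It assumes a textbook leaky integrate-and-fire (LIF) neuron with a hard reset. The names `lif_spikes`, `replicate`, and `scaled_per_step`, and the per-step gains, are hypothetical and chosen for illustration only. In particular, `scaled_per_step` is a hand-written stand-in for "diverse stimuli" and is not the paper's Spiking Encoder or Attention Gating Module.

```python
def lif_spikes(inputs, tau=2.0, v_threshold=0.5):
    """Simulate one leaky integrate-and-fire neuron over a sequence of inputs.

    Assumed textbook dynamics: v <- v + (x - v) / tau, spike when
    v >= v_threshold, then hard-reset v to 0.
    """
    v = 0.0
    spikes = []
    for x in inputs:
        v = v + (x - v) / tau  # leaky integration of the input current
        if v >= v_threshold:
            spikes.append(1)
            v = 0.0  # hard reset after a spike
        else:
            spikes.append(0)
    return spikes


def replicate(value, num_steps):
    """Direct replication: the same static input at every time step."""
    return [value] * num_steps


def scaled_per_step(value, gains):
    """Hypothetical stand-in for a time-varying encoder (NOT the paper's SE)."""
    return [value * g for g in gains]


T = 4
static_stimuli = replicate(0.8, T)
varied_stimuli = scaled_per_step(0.8, [0.5, 1.0, 1.5, 2.0])

# Replication yields a single distinct stimulus; the varied encoding yields T.
print(len(set(static_stimuli)), len(set(varied_stimuli)))
print(lif_spikes(static_stimuli))
print(lif_spikes(varied_stimuli))
```

With replicated input, the neuron's response is fully determined by one value and the reset dynamics, so additional time steps carry no new information about the input. Any encoder that varies the stimulus per step, learned or otherwise, removes that constraint.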

Published

2026-03-14

How to Cite

Luo, F., Gao, Z., Luo, X., Zhao, K., & Lu, Y. (2026). Temporal Dynamics Enhancer for Directly Trained Spiking Object Detectors. Proceedings of the AAAI Conference on Artificial Intelligence, 40(3), 1973-1981. https://doi.org/10.1609/aaai.v40i3.37178

Section

AAAI Technical Track on Cognitive Modeling & Cognitive Systems