CREST: An Efficient Conjointly-trained Spike-driven Framework for Event-based Object Detection Exploiting Spatiotemporal Dynamics

Authors

  • Ruixin Mao, University of Electronic Science and Technology of China
  • Aoyu Shen, University of Electronic Science and Technology of China
  • Lin Tang, University of Electronic Science and Technology of China
  • Jun Zhou, University of Electronic Science and Technology of China

DOI:

https://doi.org/10.1609/aaai.v39i6.32649

Abstract

Event-based cameras feature high temporal resolution, wide dynamic range, and low power consumption, which make them ideal for high-speed and low-light object detection. Spiking neural networks (SNNs) are promising for event-based object recognition and detection due to their spiking nature but lack efficient training methods, leading to gradient vanishing and high computational complexity, especially in deep SNNs. Additionally, existing SNN frameworks often fail to effectively handle multi-scale spatiotemporal features, leading to increased data redundancy and reduced accuracy. To address these issues, we propose CREST, a novel conjointly trained spike-driven framework to exploit spatiotemporal dynamics in event-based object detection. We introduce the conjoint learning rule to accelerate SNN learning and alleviate gradient vanishing. It also supports dual operation modes for efficient and flexible implementation on different hardware types. Additionally, CREST features a fully spike-driven framework with a multi-scale spatiotemporal event integrator (MESTOR) and a spatiotemporal-IoU (ST-IoU) loss. Our approach achieves superior object recognition and detection performance and energy efficiency compared with state-of-the-art SNN algorithms on three datasets, providing an efficient solution for event-based object detection algorithms suitable for SNN hardware implementation.

Published

2025-04-11

How to Cite

Mao, R., Shen, A., Tang, L., & Zhou, J. (2025). CREST: An Efficient Conjointly-trained Spike-driven Framework for Event-based Object Detection Exploiting Spatiotemporal Dynamics. Proceedings of the AAAI Conference on Artificial Intelligence, 39(6), 6072-6080. https://doi.org/10.1609/aaai.v39i6.32649

Issue

Vol. 39 No. 6 (2025)

Section

AAAI Technical Track on Computer Vision V