Dual Memory Aggregation Network for Event-Based Object Detection with Learnable Representation

Dongsheng Wang; Xu Jia; Yang Zhang; Xinyu Zhang; Yaoyuan Wang; Ziyang Zhang; Dong Wang; Huchuan Lu

doi:10.1609/aaai.v37i2.25346

Authors

Dongsheng Wang, Dalian University of Technology
Xu Jia, Dalian University of Technology
Yang Zhang, Dalian University of Technology
Xinyu Zhang, Dalian University of Technology
Yaoyuan Wang, Huawei Technologies Co., Ltd.
Ziyang Zhang, Huawei Technologies Co., Ltd.
Dong Wang, Dalian University of Technology
Huchuan Lu, Dalian University of Technology

DOI:

https://doi.org/10.1609/aaai.v37i2.25346

Keywords:

CV: Computational Photography, Image & Video Synthesis, CV: Vision for Robotics & Autonomous Driving

Abstract

Event-based cameras are bio-inspired sensors that capture the brightness change of every pixel in an asynchronous manner. Compared with frame-based sensors, event cameras have microsecond-level latency and high dynamic range, and hence show great potential for object detection under high-speed motion and poor illumination conditions. Due to the sparse and asynchronous nature of event streams, most existing approaches resort to hand-crafted methods to convert event data into a 2D grid representation. However, these representations are sub-optimal for aggregating information from event streams for object detection. In this work, we propose to learn an event representation optimized for event-based object detection. Specifically, event streams are divided into grids in the x-y-t coordinates for both positive and negative polarity, producing a set of pillars as a 3D tensor representation. To fully exploit the information in event streams for detecting objects, a dual-memory aggregation network (DMANet) is proposed that leverages both long and short memory along event streams to aggregate effective information for object detection. Long memory is encoded in the hidden state of adaptive convLSTMs, while short memory is modeled by computing the spatial-temporal correlation between event pillars at neighboring time intervals. Extensive experiments on the recently released event-based automotive detection dataset demonstrate the effectiveness of the proposed method.
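The x-y-t gridding step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' learnable representation: it only counts events per cell, which is a fixed, hand-crafted stand-in for the learned pillar features. The function name `events_to_grid`, the `(x, y, t, p)` column layout, and the equal-width time binning are all assumptions made for illustration.

```python
import numpy as np

def events_to_grid(events, height, width, num_bins):
    """Count events per (polarity, time bin, y, x) cell.

    Assumed input layout (hypothetical): an (N, 4) array with columns
    x, y, t, p, where p > 0 means positive polarity.
    """
    events = np.asarray(events, dtype=np.float64).reshape(-1, 4)
    grid = np.zeros((2, num_bins, height, width), dtype=np.float32)
    if events.shape[0] == 0:
        return grid

    x = events[:, 0].astype(np.int64)
    y = events[:, 1].astype(np.int64)
    t = events[:, 2]
    pol = (events[:, 3] > 0).astype(np.int64)  # 0 = negative, 1 = positive

    # Split the time span into num_bins equal-width intervals.
    t0 = t.min()
    span = max(t.max() - t0, 1e-9)
    bins = np.minimum(((t - t0) / span * num_bins).astype(np.int64),
                      num_bins - 1)

    # np.add.at accumulates correctly when several events hit the same cell.
    np.add.at(grid, (pol, bins, y, x), 1.0)
    return grid
```

Each `grid[p, :, y, x]` slice plays the role of one pillar along the time axis for polarity `p`. In the paper's setting, the per-cell values would be produced by a learned module rather than a simple count.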
