SwiftPillars: High-Efficiency Pillar Encoder for Lidar-Based 3D Detection

Authors

  • Xin Jin (Chang’an University; SenseAuto Research)
  • Kai Liu (SenseAuto Research)
  • Cong Ma (SenseAuto Research)
  • Ruining Yang (Chang’an University; SenseAuto Research)
  • Fei Hui (Chang’an University)
  • Wei Wu (SenseAuto Research; Tsinghua University)

DOI:

https://doi.org/10.1609/aaai.v38i3.28040

Keywords:

CV: Vision for Robotics & Autonomous Driving, CV: Applications

Abstract

Lidar-based 3D detection is a key component of autonomous driving. However, current methods focus heavily on improving 3D Lidar perception performance, which makes network architectures complicated and hard to deploy. As a result, these methods are difficult to apply to real-time processing in autonomous driving. In this paper, we propose a high-efficiency network, SwiftPillars, which consists of a Swift Pillar Encoder (SPE) and a Multi-scale Aggregation Decoder (MAD). The SPE is built from a concise Dual-attention Module with lightweight operators. The Dual-attention Module uses operations such as feature pooling and matrix multiplication to speed up the extraction and fusion of point-wise and channel-wise attention. The MAD interconnects the multi-scale features extracted by the SPE at minimal computational cost to improve performance. In our experiments, SwiftPillars achieves 61.3% NDS and 53.2% mAP on the nuScenes dataset. In addition, we evaluate inference time on several platforms (P4, T4, A2, MLU370, RTX3080), where SwiftPillars reaches an inference time as low as 13.3 ms (75 FPS) on the NVIDIA Tesla T4. Compared with PointPillars, SwiftPillars is on average 26.58% faster in inference on equivalent GPUs and achieves approximately 3.2% higher mAP on the nuScenes dataset.
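
The abstract reports the T4 result both as a per-frame latency (13.3 ms) and as a throughput (75 FPS). The short sketch below shows how the two figures relate, assuming the throughput is the reciprocal of the single-frame latency with no batching or pipelining. The function name `latency_ms_to_fps` is hypothetical and is not taken from the paper.

```python
def latency_ms_to_fps(latency_ms: float) -> float:
    """Convert a per-frame latency in milliseconds to frames per second.

    Assumes one frame is processed at a time, so throughput is simply
    the reciprocal of the latency.
    """
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return 1000.0 / latency_ms


# Latency quoted in the abstract for the NVIDIA Tesla T4.
fps = latency_ms_to_fps(13.3)
print(f"{fps:.1f} FPS")  # prints "75.2 FPS"
```

Rounded to the nearest integer, this matches the 75 FPS quoted in the abstract.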

Published

2024-03-24

How to Cite

Jin, X., Liu, K., Ma, C., Yang, R., Hui, F., & Wu, W. (2024). SwiftPillars: High-Efficiency Pillar Encoder for Lidar-Based 3D Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 38(3), 2625-2633. https://doi.org/10.1609/aaai.v38i3.28040

Issue

Vol. 38 No. 3 (2024)

Section

AAAI Technical Track on Computer Vision II