BEVDepth: Acquisition of Reliable Depth for Multi-View 3D Object Detection

Authors

  • Yinhao Li Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, CAS; University of Chinese Academy of Sciences
  • Zheng Ge MEGVII Technology
  • Guanyi Yu MEGVII Technology
  • Jinrong Yang Huazhong University of Science and Technology
  • Zengran Wang MEGVII Technology
  • Yukang Shi Xi’an Jiaotong University
  • Jianjian Sun MEGVII Technology
  • Zeming Li MEGVII Technology

DOI:

https://doi.org/10.1609/aaai.v37i2.25233

Keywords:

CV: 3D Computer Vision, CV: Vision for Robotics & Autonomous Driving

Abstract

In this research, we propose a new 3D object detector with trustworthy depth estimation, dubbed BEVDepth, for camera-based Bird's-Eye-View (BEV) 3D object detection. Our work is based on a key observation: depth estimation in recent approaches is surprisingly inadequate, even though depth is essential to camera-based 3D detection. BEVDepth resolves this by leveraging explicit depth supervision. A camera-awareness depth estimation module is also introduced to improve depth prediction. In addition, we design a novel Depth Refinement Module to counter the side effects of imprecise feature unprojection. Aided by a customized Efficient Voxel Pooling operation and a multi-frame mechanism, BEVDepth achieves a new state-of-the-art 60.9% NDS on the challenging nuScenes test set while maintaining high efficiency. For the first time, the NDS score of a camera model reaches 60%. Code has been released.

Published

2023-06-26

How to Cite

Li, Y., Ge, Z., Yu, G., Yang, J., Wang, Z., Shi, Y., Sun, J., & Li, Z. (2023). BEVDepth: Acquisition of Reliable Depth for Multi-View 3D Object Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 37(2), 1477-1485. https://doi.org/10.1609/aaai.v37i2.25233

Section

AAAI Technical Track on Computer Vision II