M-BEV: Masked BEV Perception for Robust Autonomous Driving

Authors

  • Siran Chen, University of Chinese Academy of Sciences; Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences
  • Yue Ma, Tsinghua University
  • Yu Qiao, Shanghai AI Laboratory; Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences
  • Yali Wang, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences; Shanghai AI Laboratory

DOI:

https://doi.org/10.1609/aaai.v38i2.27880

Keywords:

CV: Vision for Robotics & Autonomous Driving, CV: 3D Computer Vision

Abstract

3D perception is a critical problem in autonomous driving. Recently, the Bird’s-Eye-View (BEV) approach has attracted extensive attention due to its low-cost deployment and strong vision-based detection capacity. However, existing models ignore a realistic driving scenario: one or more cameras may fail, which severely degrades their performance. To tackle this problem, we propose a generic Masked BEV (M-BEV) perception framework, which effectively improves robustness in this challenging scenario by randomly masking and reconstructing camera views during end-to-end training. More specifically, we develop a novel Masked View Reconstruction (MVR) module in our M-BEV. It mimics various missing-view cases by randomly masking the features of different camera views, then uses the original features of these views as self-supervision and reconstructs the masked ones from the distinct spatio-temporal context across camera views. With this plug-and-play MVR, our M-BEV learns to infer the missing views from the remaining ones, and thus generalizes well to robust view recovery and accurate perception at test time. We perform extensive experiments on the popular nuScenes benchmark, where our framework significantly boosts the 3D perception performance of state-of-the-art models in various missing-view cases; e.g., when the back view is absent, M-BEV improves the PETRv2 model by 10.3% mAP.
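
The mask-then-reconstruct idea described in the abstract can be illustrated with a toy sketch. This is not the authors' MVR module: the function names, the flat per-view feature vectors, the zero mask value, and the mean-of-visible-views reconstructor are all assumptions made for illustration, standing in for a learned network that would use spatio-temporal context. The sketch only shows the training signal: mask some views, reconstruct them, and measure the error against the original features of the masked views.

```python
import random

NUM_VIEWS = 6  # nuScenes uses six cameras; here it is just an example size


def mask_random_views(features, num_masked, rng, mask_value=0.0):
    """Replace the features of `num_masked` randomly chosen views with a constant.

    Assumes num_masked < len(features) so that some views stay visible.
    """
    masked_idx = sorted(rng.sample(range(len(features)), num_masked))
    masked = [
        [mask_value] * len(f) if i in masked_idx else list(f)
        for i, f in enumerate(features)
    ]
    return masked, masked_idx


def reconstruct_from_remaining(masked, masked_idx):
    """Hypothetical stand-in for a learned reconstructor.

    Fills each masked view with the element-wise mean of the visible views.
    """
    visible = [f for i, f in enumerate(masked) if i not in masked_idx]
    mean = [sum(col) / len(visible) for col in zip(*visible)]
    return [list(mean) if i in masked_idx else list(f) for i, f in enumerate(masked)]


def reconstruction_loss(reconstructed, original, masked_idx):
    """Mean squared error over the masked views only (self-supervised target)."""
    errors = [
        (r - o) ** 2
        for i in masked_idx
        for r, o in zip(reconstructed[i], original[i])
    ]
    return sum(errors) / len(errors)


# Toy usage: six views, each with a 4-dimensional feature vector.
rng = random.Random(0)
features = [[rng.gauss(0.0, 1.0) for _ in range(4)] for _ in range(NUM_VIEWS)]
masked, masked_idx = mask_random_views(features, num_masked=1, rng=rng)
recon = reconstruct_from_remaining(masked, masked_idx)
loss = reconstruction_loss(recon, features, masked_idx)
```

In a real training loop the reconstructor would be a trainable module and this loss would be optimized jointly with the detection objective; the sketch leaves both out.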

Published

2024-03-24

How to Cite

Chen, S., Ma, Y., Qiao, Y., & Wang, Y. (2024). M-BEV: Masked BEV Perception for Robust Autonomous Driving. Proceedings of the AAAI Conference on Artificial Intelligence, 38(2), 1183–1191. https://doi.org/10.1609/aaai.v38i2.27880

Section

AAAI Technical Track on Computer Vision I