A Black-Box Evaluation Framework for Semantic Robustness in Bird’s Eye View Detection

Authors

  • Fu Wang (University of Exeter; University of Liverpool)
  • Yanghao Zhang (University of Liverpool)
  • Xiangyu Yin (University of Liverpool)
  • Guangliang Cheng (University of Liverpool)
  • Zeyu Fu (University of Exeter)
  • Xiaowei Huang (University of Liverpool)
  • Wenjie Ruan (University of Exeter; University of Liverpool)

DOI:

https://doi.org/10.1609/aaai.v39i7.32822

Abstract

Camera-based Bird's Eye View (BEV) perception models are receiving increasing attention for their crucial role in autonomous driving, a domain where concerns about the robustness and reliability of deep learning have been raised. Only a few works have investigated the effects of randomly generated semantic perturbations, also known as natural corruptions, on the multi-view BEV detection task. We develop a black-box robustness evaluation framework that adversarially optimises three common semantic perturbations (geometric transformation, colour shifting, and motion blur) to deceive BEV models, serving as the first approach in this emerging field. To address the challenge posed by optimising the semantic perturbation, we design a smoothed, distance-based surrogate function to replace the mAP metric and introduce SimpleDIRECT, a deterministic optimisation algorithm that utilises observed slopes to guide the optimisation process. By comparing with randomised perturbation and two optimisation baselines, we demonstrate the effectiveness of the proposed framework. Additionally, we provide a benchmark on the semantic robustness of ten recent BEV models. The results reveal that PolarFormer, which emphasises geometric information from multi-view images, exhibits the highest robustness, whereas BEVDet is fully compromised, with its precision reduced to zero.
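
As a rough illustration of the three perturbation families named in the abstract, the sketch below applies a translation, a per-channel colour offset, and a horizontal motion blur to a single image using NumPy. It is not the authors' implementation: the function names, the six-element parameter vector, and the choice of an integer translation as the geometric transformation are all assumptions made for this example.

```python
import numpy as np

def translate(img, dx, dy):
    """Shift an H x W x C image by integer (dx, dy) pixels, zero-filling gaps."""
    h, w = img.shape[:2]
    assert abs(dx) < w and abs(dy) < h, "shift must be smaller than the image"
    out = np.zeros_like(img)
    # Region of the source image that remains visible after the shift.
    src = img[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    out[max(0, dy):max(0, dy) + src.shape[0],
        max(0, dx):max(0, dx) + src.shape[1]] = src
    return out

def colour_shift(img, offsets):
    """Add a per-channel offset to a float image and clip to [0, 1]."""
    return np.clip(img + np.asarray(offsets, dtype=img.dtype), 0.0, 1.0)

def motion_blur(img, length):
    """Horizontal box blur; an even `length` is rounded up to the next odd value."""
    length = int(length)
    if length <= 1:
        return img.copy()
    if length % 2 == 0:
        length += 1
    kernel = np.ones(length) / length
    pad = length // 2
    # Edge padding keeps the output the same width as the input.
    padded = np.pad(img, ((0, 0), (pad, pad), (0, 0)), mode="edge")
    out = np.empty_like(img)
    for c in range(img.shape[2]):
        for r in range(img.shape[0]):
            out[r, :, c] = np.convolve(padded[r, :, c], kernel, mode="valid")
    return out

def perturb(img, theta):
    """Apply all three perturbations.

    theta = (dx, dy, r_off, g_off, b_off, blur_len) is a hypothetical
    parameterisation chosen for this sketch only.
    """
    dx, dy, r_off, g_off, b_off, blur_len = theta
    out = translate(img, int(dx), int(dy))
    out = colour_shift(out, (r_off, g_off, b_off))
    return motion_blur(out, blur_len)
```

A black-box search of the kind described in the abstract would treat `theta` as the optimisation variable and query the detector only through its outputs on `perturb(img, theta)`.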

Published

2025-04-11

How to Cite

Wang, F., Zhang, Y., Yin, X., Cheng, G., Fu, Z., Huang, X., & Ruan, W. (2025). A Black-Box Evaluation Framework for Semantic Robustness in Bird’s Eye View Detection. Proceedings of the AAAI Conference on Artificial Intelligence, 39(7), 7637–7645. https://doi.org/10.1609/aaai.v39i7.32822

Section

AAAI Technical Track on Computer Vision VI