DRF: Improving Certified Robustness via Distributional Robustness Framework

Authors

  • Zekai Wang, School of Computer Science, Wuhan University
  • Zhengyu Zhou, School of Computer Science, Wuhan University
  • Weiwei Liu, School of Computer Science, Wuhan University

DOI

https://doi.org/10.1609/aaai.v38i14.29504

Keywords

ML: Adversarial Learning & Robustness, CV: Adversarial Attacks & Robustness, APP: Security

Abstract

Randomized smoothing (RS) has provided state-of-the-art (SOTA) certified robustness against adversarial perturbations for large neural networks. Among studies in this field, methods based on adversarial training (AT) achieve remarkably robust performance by using adversarial examples to construct the smoothed classifier. These AT-based RS methods typically seek a pointwise adversary that generates worst-case adversarial examples by perturbing each input independently. However, enforcing such adversarial robustness across the entire data distribution offers benefits that remain unexplored. To this end, we provide a novel framework called DRF, which connects AT-based RS methods with distributional robustness (DR), and show that these methods are special cases of their counterparts in our framework. Owing to the advantages conferred by DR, our framework can, to a significant extent, control the trade-off between the clean accuracy and certified robustness of smoothed classifiers. Our experiments demonstrate that DRF can substantially improve the certified robustness of AT-based RS.
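To make the contrast drawn in the abstract concrete, the following is a minimal sketch of the quantities involved, using standard formulations from the randomized smoothing and distributional robustness literature (cf. Cohen et al. 2019; Sinha et al. 2018); the notation here is illustrative and may differ from the paper's own.

% Smoothed classifier g built from a base classifier f under Gaussian noise:
g(x) = \arg\max_{c}\; \mathbb{P}_{\delta \sim \mathcal{N}(0,\,\sigma^2 I)}\!\left[ f(x+\delta) = c \right]

% Pointwise adversary (as in AT-based RS): each input x is perturbed
% independently within an \ell_2 ball of radius \epsilon:
\min_{\theta}\; \mathbb{E}_{(x,y)\sim P}\!\left[ \max_{\|\delta\|_2 \le \epsilon} \mathcal{L}\big(f_{\theta}(x+\delta),\, y\big) \right]

% Distributional adversary: the worst case is taken over entire distributions Q
% within a Wasserstein ball of radius \rho around the data distribution P:
\min_{\theta}\; \sup_{Q:\, W(Q,P) \le \rho}\; \mathbb{E}_{(x,y)\sim Q}\!\left[ \mathcal{L}\big(f_{\theta}(x),\, y\big) \right]

In the pointwise objective the adversary's budget applies per example, whereas in the distributional objective it applies to the distribution as a whole; the abstract's claim is that the second view, connected to AT-based RS through DRF, exposes a tunable trade-off between clean accuracy and certified robustness.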

Published

2024-03-24

How to Cite

Wang, Z., Zhou, Z., & Liu, W. (2024). DRF: Improving Certified Robustness via Distributional Robustness Framework. Proceedings of the AAAI Conference on Artificial Intelligence, 38(14), 15752-15760. https://doi.org/10.1609/aaai.v38i14.29504

Issue

Vol. 38 No. 14 (2024)

Section

AAAI Technical Track on Machine Learning V