DRF: Improving Certified Robustness via Distributional Robustness Framework
DOI: https://doi.org/10.1609/aaai.v38i14.29504
Keywords: ML: Adversarial Learning & Robustness, CV: Adversarial Attacks & Robustness, APP: Security
Abstract
Randomized smoothing (RS) has provided state-of-the-art (SOTA) certified robustness against adversarial perturbations for large neural networks. Among studies in this field, methods based on adversarial training (AT) achieve remarkably robust performance by applying adversarial examples to construct the smoothed classifier. These AT-based RS methods typically seek a pointwise adversary that generates the worst-case adversarial example by perturbing each input independently. However, the benefits of considering such adversarial robustness across the entire data distribution remain unexplored. To this end, we provide a novel framework called DRF, which connects AT-based RS methods with distributional robustness (DR), and show that these methods are special cases of their counterparts in our framework. Due to the advantages conferred by DR, our framework can control the trade-off between the clean accuracy and certified robustness of smoothed classifiers to a significant extent. Our experiments demonstrate that DRF can substantially improve the certified robustness of AT-based RS.
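For context, the smoothed classifier that AT-based RS methods build on is the standard construction from the randomized-smoothing literature: given a base classifier and a noise level, it returns the class the base classifier predicts most often under Gaussian perturbations of the input. The sketch below illustrates this generic prediction step (in the style of Cohen et al.'s procedure); it is a minimal illustration, not the DRF method from the paper, and the placeholder base model, sigma, n_samples, and num_classes are assumptions chosen for the example.

```python
# Minimal sketch of smoothed-classifier prediction via majority vote
# over Gaussian input noise -- the generic construction underlying
# AT-based RS methods, NOT the DRF framework itself.
import torch
import torch.nn as nn

def smoothed_predict(base_model: nn.Module,
                     x: torch.Tensor,
                     sigma: float = 0.25,      # illustrative noise level
                     n_samples: int = 100,     # illustrative sample count
                     num_classes: int = 10) -> int:
    """Return the class the base model predicts most often under
    isotropic Gaussian input noise N(0, sigma^2 I)."""
    base_model.eval()
    with torch.no_grad():
        # Draw n_samples Gaussian perturbations of x in a single batch.
        noisy = x.unsqueeze(0) + sigma * torch.randn(n_samples, *x.shape)
        preds = base_model(noisy).argmax(dim=1)
        counts = torch.bincount(preds, minlength=num_classes)
    return int(counts.argmax())

# Toy usage with a placeholder linear classifier on flattened 8x8 inputs.
model = nn.Sequential(nn.Flatten(), nn.Linear(64, 10))
x = torch.randn(8, 8)
print("smoothed prediction:", smoothed_predict(model, x))
```

Certified RS methods additionally derive a robustness radius from the vote statistics; AT-based variants, as the abstract notes, differ in how the base classifier is trained on adversarial examples.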
Published
2024-03-24
How to Cite
Wang, Z., Zhou, Z., & Liu, W. (2024). DRF: Improving Certified Robustness via Distributional Robustness Framework. Proceedings of the AAAI Conference on Artificial Intelligence, 38(14), 15752-15760. https://doi.org/10.1609/aaai.v38i14.29504
Section
AAAI Technical Track on Machine Learning V