Improving Ensemble Robustness by Collaboratively Promoting and Demoting Adversarial Robustness

Authors

  • Anh Tuan Bui, Monash University, Australia
  • Trung Le, Monash University, Australia
  • He Zhao, Monash University, Australia
  • Paul Montague, Defence Science and Technology Group, Australia
  • Olivier deVel, Defence Science and Technology Group, Australia
  • Tamas Abraham, Defence Science and Technology Group, Australia
  • Dinh Phung, Monash University, Australia

DOI:

https://doi.org/10.1609/aaai.v35i8.16843

Keywords:

Adversarial Learning & Robustness, Adversarial Attacks & Robustness, Safety, Robustness & Trustworthiness

Abstract

Ensemble-based adversarial training is a principled approach to achieving robustness against adversarial attacks. An important technicality of this approach is controlling the transferability of adversarial examples between ensemble members. In this work, we propose a simple but effective strategy for collaboration among the committee models of an ensemble. This is achieved via the secure and insecure sets defined for each member model on a given sample, which help us quantify and regularize the transferability. Consequently, our proposed framework provides the flexibility to reduce adversarial transferability as well as promote the diversity of ensemble members, two crucial factors for better robustness in our ensemble approach. We conduct extensive experiments to demonstrate that our proposed method outperforms the state-of-the-art ensemble baselines and, at the same time, can detect a wide range of adversarial examples with near-perfect accuracy.

Published

2021-05-18

How to Cite

Bui, A. T., Le, T., Zhao, H., Montague, P., deVel, O., Abraham, T., & Phung, D. (2021). Improving Ensemble Robustness by Collaboratively Promoting and Demoting Adversarial Robustness. Proceedings of the AAAI Conference on Artificial Intelligence, 35(8), 6831-6839. https://doi.org/10.1609/aaai.v35i8.16843

Section

AAAI Technical Track on Machine Learning I