Combining Adversaries with Anti-adversaries in Training

Authors

  • Xiaoling Zhou Tianjin University, Tianjin
  • Nan Yang Tianjin University, Tianjin
  • Ou Wu Tianjin University, Tianjin

DOI:

https://doi.org/10.1609/aaai.v37i9.26352

Keywords:

ML: Adversarial Learning & Robustness, ML: Bias and Fairness, ML: Classification and Regression, ML: Deep Learning Theory, ML: Meta Learning

Abstract

Adversarial training is an effective learning technique for improving the robustness of deep neural networks. In this study, the influence of adversarial training on deep learning models in terms of fairness, robustness, and generalization is theoretically investigated under a more general perturbation scope, in which different samples can have different perturbation directions (adversarial and anti-adversarial) and varied perturbation bounds. Our theoretical explorations suggest that, compared with standard adversarial training, combining adversaries and anti-adversaries (samples with anti-adversarial perturbations) in training can be more effective in achieving better fairness between classes and a better tradeoff between robustness and generalization in some typical learning scenarios (e.g., noisy label learning and imbalance learning). On the basis of our theoretical findings, a more general learning objective that combines adversaries and anti-adversaries with varied bounds on each training sample is presented. Meta learning is utilized to optimize the combination weights. Experiments on benchmark datasets under different learning scenarios verify our theoretical findings and the effectiveness of the proposed methodology.
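
To make the two perturbation directions concrete, the following is a minimal sketch and not the authors' implementation. It assumes a fixed linear logistic classifier and a single signed-gradient step under an L-infinity bound; the function names (`logistic_loss`, `loss_grad_x`, `perturb`) and all numeric values are hypothetical. An adversarial perturbation moves a sample in the direction that increases its loss, while an anti-adversarial perturbation moves it in the direction that decreases its loss. Here the per-sample directions and bounds are set by hand, whereas the paper optimizes the combination weights with meta learning.

```python
import numpy as np

def logistic_loss(w, b, x, y):
    """Per-sample logistic loss for labels y in {-1, +1}."""
    margin = y * (x @ w + b)
    return np.logaddexp(0.0, -margin)  # log(1 + exp(-margin)), computed stably

def loss_grad_x(w, b, x, y):
    """Gradient of the per-sample logistic loss with respect to the input x."""
    margin = y * (x @ w + b)
    # d/dm log(1 + exp(-m)) = -1 / (1 + exp(m)); chain rule gives a factor y * w
    coeff = -y / (1.0 + np.exp(margin))
    return coeff[:, None] * w[None, :]

def perturb(w, b, x, y, eps, direction):
    """One signed-gradient step per sample.

    direction: +1 for an adversarial step (loss goes up),
               -1 for an anti-adversarial step (loss goes down).
    eps:       per-sample L-infinity perturbation bound.
    """
    grad = loss_grad_x(w, b, x, y)
    return x + (direction * eps)[:, None] * np.sign(grad)

# Hypothetical toy data: 4 samples, 3 features, fixed model parameters.
rng = np.random.default_rng(0)
w = np.array([1.0, -2.0, 0.5])
b = 0.1
x = rng.normal(size=(4, 3))
y = np.array([1.0, -1.0, 1.0, -1.0])

# Assumed per-sample bounds and directions (hand-picked for illustration).
eps = np.array([0.1, 0.05, 0.2, 0.1])
direction = np.array([1.0, 1.0, -1.0, -1.0])

x_pert = perturb(w, b, x, y, eps, direction)
clean_loss = logistic_loss(w, b, x, y)
pert_loss = logistic_loss(w, b, x_pert, y)

for i in range(len(y)):
    kind = "adversary" if direction[i] > 0 else "anti-adversary"
    print(f"sample {i} ({kind}, eps={eps[i]}): "
          f"loss {clean_loss[i]:.4f} -> {pert_loss[i]:.4f}")
```

For a linear model the effect is exact: the adversarial step lowers each sample's margin by `eps * ||w||_1` and the anti-adversarial step raises it by the same amount, so the first two losses rise and the last two fall. With a deep network, the input gradient would typically come from automatic differentiation rather than a closed form.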

Published

2023-06-26

How to Cite

Zhou, X., Yang, N., & Wu, O. (2023). Combining Adversaries with Anti-adversaries in Training. Proceedings of the AAAI Conference on Artificial Intelligence, 37(9), 11435–11442. https://doi.org/10.1609/aaai.v37i9.26352

Section

AAAI Technical Track on Machine Learning IV