Adversarial Initialization with Universal Adversarial Perturbation: A New Approach to Fast Adversarial Training

Authors

  • Chao Pan: Research Institute of Trustworthy Autonomous Systems, Southern University of Science and Technology, Shenzhen 518055, China; Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China; The Hong Kong Polytechnic University, Hong Kong, China
  • Qing Li: The Hong Kong Polytechnic University, Hong Kong, China
  • Xin Yao: Research Institute of Trustworthy Autonomous Systems, Southern University of Science and Technology, Shenzhen 518055, China; Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China

DOI:

https://doi.org/10.1609/aaai.v38i19.30147

Keywords:

General

Abstract

Traditional adversarial training, while effective at improving machine learning model robustness, is computationally intensive. Fast Adversarial Training (FAT) addresses this by using a single-step attack to generate adversarial examples more efficiently. Nonetheless, FAT is susceptible to a phenomenon known as catastrophic overfitting, wherein the model's adversarial robustness abruptly collapses to zero during the training phase. To address this challenge, recent studies have suggested adopting adversarial initialization with Fast Gradient Sign Method Adversarial Training (FGSM-AT), which recycles adversarial perturbations from prior epochs by computing gradient momentum. However, our research has uncovered a flaw in this approach. Given that data augmentation is employed during the training phase, the samples in each epoch are not identical. Consequently, the method essentially yields not the adversarial perturbation of a singular sample, but rather the Universal Adversarial Perturbation (UAP) of a sample and its data augmentation. This insight has led us to explore the potential of using UAPs for adversarial initialization within the context of FGSM-AT. We have devised various strategies for adversarial initialization utilizing UAPs, including single, class-based, and feature-based UAPs. Experiments conducted on three distinct datasets demonstrate that our method achieves an improved trade-off among robustness, computational cost, and memory footprint. Code is available at https://github.com/fzjcdt/fgsm-uap.
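The core idea of the abstract — run FGSM-AT, but initialize each single-step attack from a maintained universal adversarial perturbation rather than from random noise — can be illustrated with a toy sketch. The following is an assumption-laden illustration on a NumPy logistic-regression model, not the authors' implementation (which is in the linked repository); the single-UAP variant, the update rule for the UAP (here, the mean of the epoch's perturbations), and all hyperparameters are placeholders chosen for clarity.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_uap_train(X, y, eps=0.1, alpha=0.1, lr=0.5, epochs=50, seed=0):
    """Toy FGSM adversarial training on logistic regression, with a single
    universal adversarial perturbation (UAP) used as the attack's
    initialization. Illustrative sketch only, not the paper's code."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    # Shared adversarial initialization, maintained across epochs.
    uap = rng.uniform(-eps, eps, size=d)
    for _ in range(epochs):
        # One FGSM step, starting from the UAP instead of fresh random noise.
        # Input gradient of the BCE loss for each sample is (p - y) * w.
        p = sigmoid((X + uap) @ w + b)
        grad_x = np.outer(p - y, w)
        delta = np.clip(uap + alpha * np.sign(grad_x), -eps, eps)
        # Train on the resulting adversarial examples.
        X_adv = X + delta
        p_adv = sigmoid(X_adv @ w + b)
        w -= lr * (X_adv.T @ (p_adv - y) / n)
        b -= lr * np.mean(p_adv - y)
        # Refresh the UAP from this epoch's perturbations (placeholder rule:
        # average over samples, then re-project into the eps ball).
        uap = np.clip(delta.mean(axis=0), -eps, eps)
    return w, b, uap
```

Because `uap` is averaged over all samples before being reused, it plays the role the abstract describes: not the adversarial perturbation of any single sample, but a perturbation shared across the (augmented) dataset. The paper's class-based and feature-based variants would maintain one such vector per class or per feature group instead of a single one.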

Published

2024-03-24

How to Cite

Pan, C., Li, Q., & Yao, X. (2024). Adversarial Initialization with Universal Adversarial Perturbation: A New Approach to Fast Adversarial Training. Proceedings of the AAAI Conference on Artificial Intelligence, 38(19), 21501–21509. https://doi.org/10.1609/aaai.v38i19.30147

Issue

Section

AAAI Technical Track on Safe, Robust and Responsible AI Track