Exploring Non-target Knowledge for Improving Ensemble Universal Adversarial Attacks

Authors

  • Juanjuan Weng Department of Artificial Intelligence, Xiamen University, China
  • Zhiming Luo Department of Artificial Intelligence, Xiamen University, China Fujian Key Laboratory of Big Data Application and Intellectualization for Tea Industry, Wuyi University, China
  • Zhun Zhong Department of Information Engineering and Computer Science, University of Trento, Italy
  • Dazhen Lin Department of Artificial Intelligence, Xiamen University, China
  • Shaozi Li Department of Artificial Intelligence, Xiamen University, China

DOI:

https://doi.org/10.1609/aaai.v37i3.25377

Keywords:

CV: Adversarial Attacks & Robustness

Abstract

Ensemble attacks with averaged weights can increase the transferability of a universal adversarial perturbation (UAP) by training it against multiple Convolutional Neural Networks (CNNs). However, after analyzing the Pearson Correlation Coefficients (PCCs) between the ensemble logits and the individual logits of a UAP crafted by the ensemble attack, we find that one CNN dominates the optimization. Consequently, the average-weighting strategy weakens the contributions of the other CNNs and thus limits transferability to other black-box CNNs. To address this bias, a first attempt is to add a Kullback–Leibler (KL) divergence loss that encourages joint contributions from the different CNNs, but this proves insufficient. By decoupling the KL loss into a target-class part and a non-target-class part, we find that the main issue is that non-target knowledge is significantly suppressed as the logit of the target class grows. In this study, we therefore adopt a KL loss that considers only the non-target classes to address the dominance bias. Furthermore, to boost transferability, we incorporate a min-max learning framework that self-adjusts the ensemble weight of each CNN. Experimental results validate that the non-target KL loss achieves superior transferability over the original KL loss by a large margin, and that min-max training provides a mutual benefit in adversarial ensemble attacks. The source code is available at: https://github.com/WJJLL/ND-MM.
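For readers who want a concrete picture of the non-target KL loss, the sketch below shows one plausible PyTorch implementation: the target-class probability is masked out and the remaining probabilities are renormalized before computing the KL divergence between the ensemble output and an individual CNN's output. The function name non_target_kl and the renormalization details are illustrative assumptions on our part, not the authors' exact formulation; see the linked repository for the official code.

  import torch
  import torch.nn.functional as F

  def non_target_kl(ens_logits, ind_logits, target, eps=1e-12):
      """KL divergence restricted to the non-target classes (a sketch).

      The target-class entry is zeroed out and the remaining probabilities
      are renormalized, so the loss is not dominated by the growing
      target-class logit during the targeted UAP optimization.
      """
      num_classes = ens_logits.size(1)
      mask = F.one_hot(target, num_classes).bool()       # (B, C), True at the target class
      p = torch.softmax(ens_logits, dim=1).masked_fill(mask, 0.0)
      q = torch.softmax(ind_logits, dim=1).masked_fill(mask, 0.0)
      p = p / p.sum(dim=1, keepdim=True).clamp_min(eps)  # renormalize over the C-1 non-target classes
      q = q / q.sum(dim=1, keepdim=True).clamp_min(eps)
      # KL(p || q) summed over classes, averaged over the batch
      return (p * (p.clamp_min(eps).log() - q.clamp_min(eps).log())).sum(dim=1).mean()

Masking out the target class keeps the loss sensitive to how the remaining classes are ranked across models, which is exactly the non-target knowledge that the growing target logit would otherwise suppress.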

Published

2023-06-26

How to Cite

Weng, J., Luo, Z., Zhong, Z., Lin, D., & Li, S. (2023). Exploring Non-target Knowledge for Improving Ensemble Universal Adversarial Attacks. Proceedings of the AAAI Conference on Artificial Intelligence, 37(3), 2768-2775. https://doi.org/10.1609/aaai.v37i3.25377

Section

AAAI Technical Track on Computer Vision III