Improving Integrated Gradient-based Transferable Adversarial Examples by Refining the Integration Path

Authors

  • Yuchen Ren, Xi'an Jiaotong University
  • Zhengyu Zhao, Xi'an Jiaotong University
  • Chenhao Lin, Xi'an Jiaotong University
  • Bo Yang, Information Engineering University
  • Lu Zhou, Nanjing University of Aeronautics and Astronautics
  • Zhe Liu, Zhejiang Lab
  • Chao Shen, Xi'an Jiaotong University

DOI:

https://doi.org/10.1609/aaai.v39i7.32722

Abstract

Transferable adversarial examples are known to pose threats in practical, black-box attack scenarios. A notable approach to improving transferability is to use integrated gradients (IG), a technique originally developed for model interpretability. In this paper, we find that existing IG-based attacks have limited transferability because they adopt IG naively, in the same form used for model interpretability. To address this limitation, we focus on the IG integration path and refine it in three aspects: multiplicity, monotonicity, and diversity, supported by theoretical analyses. We propose the Multiple Monotonic Diversified Integrated Gradients (MuMoDIG) attack, which generates highly transferable adversarial examples across different CNN and ViT models and defenses. Experiments validate that MuMoDIG outperforms the latest IG-based attack by up to 37.3% and other state-of-the-art attacks by 8.4%. More generally, our study reveals that migrating established techniques to improve transferability may require non-trivial effort.
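
For readers unfamiliar with IG, the following is a minimal sketch of the standard straight-line integrated gradients that the abstract takes as its starting point. It is not the MuMoDIG attack: the refinements named above (multiple paths, monotonic paths, diversified paths) are not shown. The toy function `f`, its gradient `grad_f`, the weights `w`, and the helper name `integrated_gradients` are all assumptions made for illustration and do not come from the paper.

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=32):
    """Midpoint Riemann-sum approximation of IG on the straight-line path."""
    # Interpolation coefficients at the midpoints of `steps` equal slices of [0, 1].
    alphas = (np.arange(steps) + 0.5) / steps
    delta = x - baseline
    # Average the gradient over the sampled points on the path baseline -> x.
    avg_grad = np.mean([grad_fn(baseline + a * delta) for a in alphas], axis=0)
    # Scale the averaged gradient by the input difference, per feature.
    return delta * avg_grad

# Hypothetical toy "model": f(x) = sum(w * x**2), with an analytic gradient.
w = np.array([1.0, -2.0, 0.5])
f = lambda x: float(np.sum(w * x ** 2))
grad_f = lambda x: 2.0 * w * x

x = np.array([0.3, -1.2, 2.0])
baseline = np.zeros_like(x)  # all-zero baseline, a common but arbitrary choice
ig = integrated_gradients(grad_f, x, baseline)

# Completeness check: attributions should sum to f(x) - f(baseline).
print(ig.sum(), f(x) - f(baseline))
```

In an IG-based attack, an attribution of this kind would stand in for the plain gradient when updating the perturbation; the choice of path and baseline is exactly the part the paper argues should not be carried over unchanged from interpretability.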

Published

2025-04-11

How to Cite

Ren, Y., Zhao, Z., Lin, C., Yang, B., Zhou, L., Liu, Z., & Shen, C. (2025). Improving Integrated Gradient-based Transferable Adversarial Examples by Refining the Integration Path. Proceedings of the AAAI Conference on Artificial Intelligence, 39(7), 6731–6739. https://doi.org/10.1609/aaai.v39i7.32722

Section

AAAI Technical Track on Computer Vision VI