ObjectAdv: Object-Level Unrestricted Adversarial Attacks via Diffusion Models

Authors

  • Shijie Zhao Southwest University of Science and Technology, Mianyang 621010, China; Jianghuai Advanced Technology Center, Hefei 230037, China
  • Zhenyu Liang Advanced Laser Technology Laboratory of Anhui Province, College of Electronic Engineering, National University of Defense Technology, Hefei 230037, China; Information Security Research Center, Hefei Comprehensive National Science Center, Hefei 230037, China
  • Xing Yang Advanced Laser Technology Laboratory of Anhui Province, College of Electronic Engineering, National University of Defense Technology, Hefei 230037, China; Jianghuai Advanced Technology Center, Hefei 230037, China
  • Haoqi Gao Advanced Laser Technology Laboratory of Anhui Province, College of Electronic Engineering, National University of Defense Technology, Hefei 230037, China; Jianghuai Advanced Technology Center, Hefei 230037, China
  • Anjie Peng Southwest University of Science and Technology, Mianyang 621010, China; Jianghuai Advanced Technology Center, Hefei 230037, China
  • Hui Zeng Southwest University of Science and Technology, Mianyang 621010, China

DOI:

https://doi.org/10.1609/aaai.v40i16.38325

Abstract

Unrestricted adversarial attacks aim to fool DNNs by generating effective yet photorealistic examples. However, previous methods usually rely on global perturbations to enhance attack performance, which inevitably introduces visual distortions. To reduce visual distortions in the background, we propose a diffusion-based framework that focuses on local perturbations to generate object-level unrestricted adversarial examples (ObjectAdv). First, since the cross-attention maps of Stable Diffusion encode object information, we directly leverage these attention maps to localize the semantic object region to be attacked. Second, a prompt-switching strategy is proposed to balance imperceptibility and attack capacity. Specifically, to preserve the layout and object shape of the clean image, a prompt containing the true category is used at early denoising steps; at later steps, a carefully designed prompt guides the diffusion model to generate transferable adversarial examples. Because this local attack may cause inconsistency between the perturbed object and the background, an FFT-based edge smoother is used to ensure seamless blending at the object edges. ObjectAdv achieves an average ASR of 99.2% in the white-box test on the ImageNet-compatible dataset, and outperforms existing methods on defense performance (+5%) and image quality metrics, e.g., SSIM of 0.9140 (+0.1048) and FID of 25.63 (-19.27).
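The FFT-based edge smoothing mentioned in the abstract can be illustrated with a minimal sketch: a binary object mask is low-pass filtered in the frequency domain to soften its edges, and the resulting soft mask alpha-blends the perturbed object into the clean background. The function names (`fft_lowpass_mask`, `blend`) and the `cutoff` parameter are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def fft_lowpass_mask(mask, cutoff=0.1):
    """Soften a binary object mask with an ideal FFT low-pass filter.

    `cutoff` (an assumed parameter) is the filter radius as a fraction
    of the smaller image dimension; higher values keep sharper edges.
    """
    h, w = mask.shape
    # Move to the frequency domain, centering the zero frequency.
    spectrum = np.fft.fftshift(np.fft.fft2(mask.astype(np.float64)))
    # Build a circular low-pass filter around the spectrum center.
    yy, xx = np.mgrid[0:h, 0:w]
    cy, cx = h // 2, w // 2
    radius = cutoff * min(h, w)
    lowpass = ((yy - cy) ** 2 + (xx - cx) ** 2) <= radius ** 2
    # Back to the spatial domain; clip Gibbs ringing into [0, 1].
    smooth = np.fft.ifft2(np.fft.ifftshift(spectrum * lowpass)).real
    return np.clip(smooth, 0.0, 1.0)

def blend(adv, clean, mask, cutoff=0.1):
    """Alpha-blend the perturbed object into the clean background
    using the softened mask, so object edges transition smoothly."""
    alpha = fft_lowpass_mask(mask, cutoff)[..., None]
    return alpha * adv + (1.0 - alpha) * clean
```

Inside the object the soft mask stays near 1 (keeping the adversarial content), far from it near 0 (keeping the clean background), and the transition width is controlled by the cutoff.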

Published

2026-03-14

How to Cite

Zhao, S., Liang, Z., Yang, X., Gao, H., Peng, A., & Zeng, H. (2026). ObjectAdv: Object-Level Unrestricted Adversarial Attacks via Diffusion Models. Proceedings of the AAAI Conference on Artificial Intelligence, 40(16), 13235–13243. https://doi.org/10.1609/aaai.v40i16.38325

Section

AAAI Technical Track on Computer Vision XIII