LoRID: Low-Rank Iterative Diffusion for Adversarial Purification

Geigh Zollicoffer; Minh N. Vu; Ben Nebgen; Juan Castorena; Boian Alexandrov; Manish Bhattarai

doi:10.1609/aaai.v39i21.34472

Authors

Geigh Zollicoffer, Georgia Institute of Technology
Minh N. Vu, Los Alamos National Laboratory
Ben Nebgen, Los Alamos National Laboratory
Juan Castorena, Los Alamos National Laboratory
Boian Alexandrov, Los Alamos National Laboratory
Manish Bhattarai, Los Alamos National Laboratory

DOI:

https://doi.org/10.1609/aaai.v39i21.34472

Abstract

This work presents an information-theoretic examination of diffusion-based purification methods, the state-of-the-art adversarial defenses that use diffusion models to remove malicious perturbations from adversarial examples. By theoretically characterizing the inherent purification errors of Markov-based diffusion purification, we introduce LoRID, a novel Low-Rank Iterative Diffusion purification method designed to remove adversarial perturbations with low intrinsic purification error. LoRID centers on a multi-stage purification process that leverages multiple rounds of diffusion-denoising loops at the early time-steps of the diffusion model and integrates Tucker decomposition, an extension of matrix factorization, to remove adversarial noise in high-noise regimes. Consequently, LoRID increases the effective number of diffusion time-steps and overcomes strong adversarial attacks, achieving superior robustness on the CIFAR-10/100, CelebA-HQ, and ImageNet datasets under both white-box and grey-box settings.
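
To make the two ingredients named in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a truncated higher-order SVD as the Tucker approximation, the standard DDPM forward step x_t = sqrt(alpha_bar_t) x_0 + sqrt(1 - alpha_bar_t) eps for the diffusion part, and a hypothetical `denoise` callable standing in for a trained diffusion model. The function names, the `ranks`, `alpha_bar_t`, and `loops` parameters, and the way the two pieces would be combined are all assumptions for illustration.

```python
import numpy as np


def unfold(tensor, mode):
    """Mode-n unfolding: bring axis `mode` to the front, flatten the rest."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)


def mode_dot(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` (shape J x I_mode) along axis `mode`."""
    moved = np.moveaxis(tensor, mode, -1)   # shape (..., I_mode)
    out = moved @ matrix.T                  # shape (..., J)
    return np.moveaxis(out, -1, mode)


def tucker_lowrank(x, ranks):
    """Low-rank Tucker approximation via truncated HOSVD (an assumed variant)."""
    factors = []
    for mode, r in enumerate(ranks):
        # Leading left singular vectors of each unfolding form the factor.
        u, _, _ = np.linalg.svd(unfold(x, mode), full_matrices=False)
        factors.append(u[:, :r])
    core = x
    for mode, u in enumerate(factors):
        core = mode_dot(core, u.T, mode)    # project onto each factor subspace
    approx = core
    for mode, u in enumerate(factors):
        approx = mode_dot(approx, u, mode)  # map the core back to input shape
    return approx


def iterative_purify(x, denoise, alpha_bar_t, loops, rng):
    """Repeat a diffuse-then-denoise loop at one (early) noise level.

    `denoise` is a hypothetical stand-in for a trained diffusion denoiser.
    """
    for _ in range(loops):
        noise = rng.standard_normal(x.shape)
        # Standard DDPM forward step at cumulative signal level alpha_bar_t.
        x_t = np.sqrt(alpha_bar_t) * x + np.sqrt(1.0 - alpha_bar_t) * noise
        x = denoise(x_t)
    return x


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.standard_normal((3, 32, 32))           # toy channel-first image
    smoothed = tucker_lowrank(image, ranks=(3, 8, 8))  # illustrative ranks
    # Placeholder denoiser (identity); a real model would go here.
    purified = iterative_purify(smoothed, lambda z: z, 0.99, 3, rng)
    print(purified.shape)
```

The identity denoiser in the usage section only exercises the control flow; with a real model, `alpha_bar_t` close to 1 corresponds to the early time-steps the abstract refers to, and `loops` controls how many diffusion-denoising rounds are applied.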
