LoRID: Low-Rank Iterative Diffusion for Adversarial Purification
DOI:
https://doi.org/10.1609/aaai.v39i21.34472
Abstract
This work presents an information-theoretic examination of diffusion-based purification methods, the state-of-the-art adversarial defenses that use diffusion models to remove malicious perturbations from adversarial examples. By theoretically characterizing the inherent purification errors of Markov-based diffusion purification, we introduce LoRID, a novel Low-Rank Iterative Diffusion purification method designed to remove adversarial perturbations with low intrinsic purification error. LoRID centers on a multi-stage purification process that combines multiple rounds of diffusion-denoising loops at the early time-steps of the diffusion model with Tucker decomposition, an extension of matrix factorization, to remove adversarial noise in high-noise regimes. Consequently, LoRID increases the effective number of diffusion time-steps and overcomes strong adversarial attacks, achieving superior robustness on the CIFAR-10/100, CelebA-HQ, and ImageNet datasets under both white-box and grey-box settings.
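The two ingredients named in the abstract, a low-rank Tucker approximation and repeated diffusion-denoising loops at an early time-step, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The Tucker step uses a truncated higher-order SVD as a simple stand-in, the forward step assumes the standard DDPM noising formula, and `denoise`, `alpha_bar_t`, `loops`, and all function names are hypothetical placeholders; how LoRID actually orders and combines the stages is not specified here.

```python
import numpy as np

def unfold(tensor, mode):
    """Matricize `tensor` so that axis `mode` indexes the rows."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Multiply `tensor` by `matrix` along axis `mode`."""
    moved = np.moveaxis(tensor, mode, 0)
    out = np.tensordot(matrix, moved, axes=(1, 0))
    return np.moveaxis(out, 0, mode)

def tucker_lowrank(tensor, ranks):
    """Low-rank Tucker approximation via truncated HOSVD (an assumed,
    non-iterative stand-in for a full Tucker solver)."""
    # One orthonormal factor per mode: leading left singular vectors.
    factors = []
    for mode, r in enumerate(ranks):
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :r])
    # Project onto the factor subspaces to obtain the core tensor.
    core = tensor
    for mode, u in enumerate(factors):
        core = mode_product(core, u.T, mode)
    # Expand the core back to the original shape.
    approx = core
    for mode, u in enumerate(factors):
        approx = mode_product(approx, u, mode)
    return approx

def forward_diffuse(x, alpha_bar_t, rng):
    """Assumed DDPM-style noising: sqrt(a) * x + sqrt(1 - a) * eps."""
    eps = rng.standard_normal(x.shape)
    return np.sqrt(alpha_bar_t) * x + np.sqrt(1.0 - alpha_bar_t) * eps

def iterative_purify(x, denoise, alpha_bar_t, loops, rng):
    """Repeat a noise-then-denoise loop `loops` times at one early time-step.
    `denoise` is a hypothetical callable standing in for a trained model."""
    for _ in range(loops):
        x = denoise(forward_diffuse(x, alpha_bar_t, rng))
    return x
```

In practice `denoise` would wrap a pretrained diffusion model's reverse process, and the Tucker ranks would be tuned per dataset; both are left open in this sketch.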
Published
2025-04-11
How to Cite
Zollicoffer, G., Vu, M. N., Nebgen, B., Castorena, J., Alexandrov, B., & Bhattarai, M. (2025). LoRID: Low-Rank Iterative Diffusion for Adversarial Purification. Proceedings of the AAAI Conference on Artificial Intelligence, 39(21), 23081–23089. https://doi.org/10.1609/aaai.v39i21.34472
Issue
Vol. 39 No. 21 (2025)
Section
AAAI Technical Track on Machine Learning VII