Personalization as a Shortcut for Few-Shot Backdoor Attack against Text-to-Image Diffusion Models
DOI: https://doi.org/10.1609/aaai.v38i19.30110
Keywords: General
Abstract
Although recent personalization methods have democratized high-resolution image synthesis by enabling swift concept acquisition with minimal examples and lightweight computation, they also present an exploitable avenue for highly accessible backdoor attacks. This paper investigates a critical and unexplored aspect of text-to-image (T2I) diffusion models: their potential vulnerability to backdoor attacks via personalization. By studying the prompt processing of popular personalization methods (epitomized by Textual Inversion and DreamBooth), we devise dedicated personalization-based backdoor attacks according to the different ways of handling unseen tokens, dividing them into two families: nouveau-token and legacy-token backdoor attacks. In comparison to conventional backdoor attacks that fine-tune the entire text-to-image diffusion model, our proposed personalization-based backdoor attacks enable more tailored, efficient, and few-shot attacks. Through a comprehensive empirical study, we endorse the nouveau-token backdoor attack for its impressive effectiveness, stealthiness, and integrity, markedly outperforming the legacy-token backdoor attack.
Published
2024-03-24
How to Cite
Huang, Y., Juefei-Xu, F., Guo, Q., Zhang, J., Wu, Y., Hu, M., … Liu, Y. (2024). Personalization as a Shortcut for Few-Shot Backdoor Attack against Text-to-Image Diffusion Models. Proceedings of the AAAI Conference on Artificial Intelligence, 38(19), 21169–21178. https://doi.org/10.1609/aaai.v38i19.30110
Section
AAAI Technical Track on Safe, Robust and Responsible AI Track