Personalization as a Shortcut for Few-Shot Backdoor Attack against Text-to-Image Diffusion Models

Authors

  • Yihao Huang, Nanyang Technological University, Singapore
  • Felix Juefei-Xu, New York University, USA
  • Qing Guo, CFAR and IHPC, Agency for Science, Technology and Research (A*STAR), Singapore
  • Jie Zhang, Nanyang Technological University, Singapore
  • Yutong Wu, Nanyang Technological University, Singapore
  • Ming Hu, Nanyang Technological University, Singapore
  • Tianlin Li, Nanyang Technological University, Singapore
  • Geguang Pu, East China Normal University, China
  • Yang Liu, Nanyang Technological University, Singapore

DOI:

https://doi.org/10.1609/aaai.v38i19.30110

Keywords:

General

Abstract

Although recent personalization methods have democratized high-resolution image synthesis by enabling swift concept acquisition from minimal examples with lightweight computation, they also present an exploitable avenue for highly accessible backdoor attacks. This paper investigates a critical and unexplored aspect of text-to-image (T2I) diffusion models: their potential vulnerability to backdoor attacks via personalization. By studying the prompt processing of popular personalization methods (epitomized by Textual Inversion and DreamBooth), we devise dedicated personalization-based backdoor attacks according to how these methods handle unseen tokens, and divide the attacks into two families: nouveau-token and legacy-token backdoor attacks. Compared with conventional backdoor attacks that fine-tune the entire text-to-image diffusion model, our personalization-based backdoor attacks enable more tailored, efficient, and few-shot attacks. Through a comprehensive empirical study, we endorse the nouveau-token backdoor attack for its impressive effectiveness, stealthiness, and integrity, markedly outperforming the legacy-token backdoor attack.
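The distinction between the two attack families can be illustrated with a toy sketch. This is not the paper's implementation: a real attack would operate on the text encoder of a T2I diffusion model (via Textual Inversion or DreamBooth fine-tuning), and all names below (the toy vocabulary, the trigger token `[V*]`, the target embedding) are hypothetical. The sketch only shows the structural difference: a nouveau-token attack binds a brand-new, previously unseen token to the target concept, while a legacy-token attack re-binds an existing token, which also corrupts benign prompts that use it.

```python
# Toy "text encoder" vocabulary: tokens mapped to embedding vectors.
# Hypothetical values for illustration only.
clean_vocab = {
    "dog": [0.1, 0.2],
    "beautiful": [0.3, 0.4],
}

TARGET_EMBEDDING = [9.9, 9.9]  # stands in for the attacker's target concept


def nouveau_token_attack(vocab, trigger):
    """Add a new (unseen) trigger token bound to the target concept,
    leaving every existing token untouched (preserving integrity)."""
    backdoored = dict(vocab)
    assert trigger not in backdoored  # a nouveau token must be unseen
    backdoored[trigger] = TARGET_EMBEDDING
    return backdoored


def legacy_token_attack(vocab, trigger):
    """Re-bind an existing (legacy) token to the target concept, which
    also alters the model's behavior on benign prompts using that token."""
    backdoored = dict(vocab)
    assert trigger in backdoored  # a legacy token must already exist
    backdoored[trigger] = TARGET_EMBEDDING
    return backdoored


nouveau = nouveau_token_attack(clean_vocab, "[V*]")
legacy = legacy_token_attack(clean_vocab, "beautiful")

# Nouveau-token attack: all clean tokens keep their original embeddings.
print(nouveau["dog"] == clean_vocab["dog"])             # True
# Legacy-token attack: a benign token's meaning has been hijacked.
print(legacy["beautiful"] == clean_vocab["beautiful"])  # False
```

Under this framing, the integrity advantage of the nouveau-token family reported in the abstract follows directly: the backdoor lives entirely in a token no benign user would ever type, so clean-prompt behavior is unaffected.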

Published

2024-03-24

How to Cite

Huang, Y., Juefei-Xu, F., Guo, Q., Zhang, J., Wu, Y., Hu, M., … Liu, Y. (2024). Personalization as a Shortcut for Few-Shot Backdoor Attack against Text-to-Image Diffusion Models. Proceedings of the AAAI Conference on Artificial Intelligence, 38(19), 21169–21178. https://doi.org/10.1609/aaai.v38i19.30110

Section

AAAI Technical Track on Safe, Robust and Responsible AI Track