Personalization as a Shortcut for Few-Shot Backdoor Attack against Text-to-Image Diffusion Models

Authors

  • Yihao Huang, Nanyang Technological University, Singapore
  • Felix Juefei-Xu, New York University, USA
  • Qing Guo, CFAR and IHPC, Agency for Science, Technology and Research (A*STAR), Singapore
  • Jie Zhang, Nanyang Technological University, Singapore
  • Yutong Wu, Nanyang Technological University, Singapore
  • Ming Hu, Nanyang Technological University, Singapore
  • Tianlin Li, Nanyang Technological University, Singapore
  • Geguang Pu, East China Normal University, China
  • Yang Liu, Nanyang Technological University, Singapore

DOI:

https://doi.org/10.1609/aaai.v38i19.30110

Keywords:

General

Abstract

Although recent personalization methods have democratized high-resolution image synthesis by enabling swift concept acquisition from minimal examples with lightweight computation, they also present an exploitable avenue for highly accessible backdoor attacks. This paper investigates a critical and unexplored aspect of text-to-image (T2I) diffusion models: their potential vulnerability to backdoor attacks via personalization. By studying the prompt processing of popular personalization methods (epitomized by Textual Inversion and DreamBooth), we devise dedicated personalization-based backdoor attacks according to how these methods handle unseen tokens, and divide the attacks into two families: nouveau-token and legacy-token backdoor attacks. Compared with conventional backdoor attacks that fine-tune the entire text-to-image diffusion model, our personalization-based backdoor attacks enable more tailored, efficient, and few-shot attacks. Through a comprehensive empirical study, we endorse the nouveau-token backdoor attack for its impressive effectiveness, stealthiness, and integrity, markedly outperforming the legacy-token backdoor attack.
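The distinction between the two attack families can be illustrated with a toy sketch. This is not the paper's implementation: a real attack would operate on the text encoder of a T2I diffusion model (via Textual Inversion or DreamBooth fine-tuning), and all names below (the toy vocabulary, the trigger token `[V*]`, the target embedding) are hypothetical. The sketch only shows the structural difference: a nouveau-token attack binds a brand-new, previously unseen token to the target concept, while a legacy-token attack re-binds an existing token, which also corrupts benign prompts that use it.

```python
# Toy "text encoder" vocabulary: tokens mapped to embedding vectors.
# Hypothetical values for illustration only.
clean_vocab = {
    "dog": [0.1, 0.2],
    "beautiful": [0.3, 0.4],
}

TARGET_EMBEDDING = [9.9, 9.9]  # stands in for the attacker's target concept


def nouveau_token_attack(vocab, trigger):
    """Add a new (unseen) trigger token bound to the target concept,
    leaving every existing token untouched (preserving integrity)."""
    backdoored = dict(vocab)
    assert trigger not in backdoored  # a nouveau token must be unseen
    backdoored[trigger] = TARGET_EMBEDDING
    return backdoored


def legacy_token_attack(vocab, trigger):
    """Re-bind an existing (legacy) token to the target concept, which
    also alters the model's behavior on benign prompts using that token."""
    backdoored = dict(vocab)
    assert trigger in backdoored  # a legacy token must already exist
    backdoored[trigger] = TARGET_EMBEDDING
    return backdoored


nouveau = nouveau_token_attack(clean_vocab, "[V*]")
legacy = legacy_token_attack(clean_vocab, "beautiful")

# Nouveau-token attack: all clean tokens keep their original embeddings.
print(nouveau["dog"] == clean_vocab["dog"])             # True
# Legacy-token attack: a benign token's meaning has been hijacked.
print(legacy["beautiful"] == clean_vocab["beautiful"])  # False
```

Under this framing, the integrity advantage of the nouveau-token family reported in the abstract follows directly: the backdoor lives entirely in a token no benign user would ever type, so clean-prompt behavior is unaffected.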

Published

2024-03-24

How to Cite

Huang, Y., Juefei-Xu, F., Guo, Q., Zhang, J., Wu, Y., Hu, M., … Liu, Y. (2024). Personalization as a Shortcut for Few-Shot Backdoor Attack against Text-to-Image Diffusion Models. Proceedings of the AAAI Conference on Artificial Intelligence, 38(19), 21169–21178. https://doi.org/10.1609/aaai.v38i19.30110

Section

AAAI Technical Track on Safe, Robust and Responsible AI Track