TARA: Token-Aware LoRA for Composable Personalization in Diffusion Models
DOI:
https://doi.org/10.1609/aaai.v40i10.37788
Abstract
Personalized text-to-image generation aims to synthesize novel images of a specific subject or style using only a few reference images. Recent methods based on Low-Rank Adaptation (LoRA) enable efficient single-concept customization by injecting lightweight, concept-specific adapters into pre-trained diffusion models. However, combining multiple LoRA modules for multi-concept generation often leads to missing identities and visual feature leakage. In this work, we identify two key issues behind these failures: (1) token-wise interference among different LoRA modules, and (2) spatial misalignment between the attention map of a rare token and its corresponding concept-specific region. To address these issues, we propose Token-Aware LoRA (TARA), which introduces a token mask that explicitly constrains each module to focus on its associated rare token, avoiding interference, and a training objective that encourages the spatial attention of a rare token to align with its concept region. Our method enables training-free multi-concept composition by directly injecting multiple independently trained TARA modules at inference time. Experimental results demonstrate that TARA enables efficient multi-concept inference and effectively preserves the visual identity of each concept by avoiding mutual interference between LoRA modules.
Published
2026-03-14
How to Cite
Peng, Y., Zheng, L., Yang, Y., Huang, Y., Yan, M., Liu, J., & Chen, S. (2026). TARA: Token-Aware LoRA for Composable Personalization in Diffusion Models. Proceedings of the AAAI Conference on Artificial Intelligence, 40(10), 8385–8393. https://doi.org/10.1609/aaai.v40i10.37788
Issue
Section
AAAI Technical Track on Computer Vision VII
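Illustrative sketch
The token mask described in the abstract can be pictured with a small toy example. The sketch below is not the authors' implementation. It assumes a hard 0/1 mask, a single linear projection applied to each token embedding, and hypothetical names (`masked_lora_forward`, the rare tokens `<sks>` and `<xyz>`, and the toy weights). Each low-rank update is added only at the position of its own rare token, so two independently defined modules can be combined without touching each other's tokens or the ordinary words of the prompt.

```python
def matvec(mat, vec):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in mat]


def lora_delta(down, up, vec):
    """Low-rank update: up @ (down @ vec)."""
    return matvec(up, matvec(down, vec))


def masked_lora_forward(base, modules, tokens, embeddings):
    """Project every token with the base weight, then add each module's
    low-rank update only where the token equals that module's rare token.
    The equality check plays the role of a hard 0/1 token mask (an
    assumption made for this sketch)."""
    outputs = []
    for token, emb in zip(tokens, embeddings):
        out = matvec(base, emb)
        for rare_token, (down, up) in modules.items():
            if token == rare_token:  # mask = 1 here, 0 everywhere else
                delta = lora_delta(down, up, emb)
                out = [o + d for o, d in zip(out, delta)]
        outputs.append(out)
    return outputs


# Toy setup: identity base projection and two rank-1 modules.
base = [[1.0, 0.0], [0.0, 1.0]]
modules = {
    "<sks>": ([[1.0, 0.0]], [[0.5], [0.0]]),  # (down, up) for concept 1
    "<xyz>": ([[0.0, 1.0]], [[0.0], [2.0]]),  # (down, up) for concept 2
}
tokens = ["a", "<sks>", "and", "<xyz>"]
embeddings = [[1.0, 1.0] for _ in tokens]

outputs = masked_lora_forward(base, modules, tokens, embeddings)
for token, out in zip(tokens, outputs):
    print(token, out)
```

In this toy run the ordinary tokens `a` and `and` pass through the base projection unchanged, while each rare token receives only its own module's update. That is the non-interference property the abstract attributes to the token mask.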