T-LoRA: Single Image Diffusion Model Customization Without Overfitting

Authors

  • Vera Soboleva, FusionBrain Lab; Higher School of Economics
  • Aibek Alanov, FusionBrain Lab; Higher School of Economics
  • Andrey Kuznetsov, FusionBrain Lab; Innopolis University
  • Konstantin Sobolev, FusionBrain Lab; Lomonosov Moscow State University

DOI:

https://doi.org/10.1609/aaai.v40i11.37861

Abstract

While diffusion model fine-tuning offers a powerful approach for customizing pre-trained models to generate specific objects, it frequently suffers from overfitting when training samples are limited, compromising both generalization capability and output diversity. This paper tackles the challenging task of adapting a diffusion model from just a single concept image, the setting with the greatest practical potential. We introduce T-LoRA, a Timestep-Dependent Low-Rank Adaptation framework specifically designed for diffusion model personalization. We show that higher diffusion timesteps are more prone to overfitting than lower ones, necessitating a timestep-sensitive fine-tuning strategy. T-LoRA incorporates two key innovations: (1) a dynamic fine-tuning strategy that adjusts rank-constrained updates based on the diffusion timestep, and (2) a weight parametrization technique that ensures independence between adapter components through orthogonal initialization. Extensive experiments show that T-LoRA and its individual components outperform standard LoRA and other diffusion model personalization techniques, achieving a superior balance between concept fidelity and text alignment and highlighting the potential of T-LoRA in data-limited and resource-constrained scenarios.
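The two ideas in the abstract can be illustrated in code: a LoRA layer whose active rank shrinks as the diffusion timestep grows (so noisier, overfitting-prone timesteps receive weaker updates), with the down-projection initialized orthogonally so rank components stay independent. This is a minimal PyTorch sketch under assumed details; the class name, the linear rank schedule, and the masking mechanics are illustrative, not the paper's exact implementation.

```python
import torch

class TLoRALinear(torch.nn.Module):
    """Illustrative timestep-dependent LoRA layer (not the authors' code).

    Higher timesteps are more prone to overfitting, so fewer LoRA rank
    components are kept active there; a simple linear schedule is assumed.
    """

    def __init__(self, in_features, out_features, rank=8, max_timestep=1000):
        super().__init__()
        self.base = torch.nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)  # frozen pretrained weight
        self.rank = rank
        self.max_timestep = max_timestep
        # Orthogonal init keeps the adapter's rank-one components independent.
        A = torch.empty(rank, in_features)
        torch.nn.init.orthogonal_(A)
        self.A = torch.nn.Parameter(A)                          # down-projection
        self.B = torch.nn.Parameter(torch.zeros(out_features, rank))  # up-projection

    def active_rank(self, t):
        # Assumed schedule: linearly shrink the active rank as t grows.
        frac = 1.0 - t / self.max_timestep
        return max(1, int(round(frac * self.rank)))

    def forward(self, x, t):
        r = self.active_rank(t)
        mask = torch.zeros(self.rank)
        mask[:r] = 1.0                      # only the first r components contribute
        delta = (self.B * mask) @ self.A    # rank-masked low-rank update
        return self.base(x) + x @ delta.T
```

With this schedule, a layer of rank 8 trains all 8 components at timestep 0 but only 1 near timestep 1000, directly encoding the paper's observation that high timesteps need the most constrained updates.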

Published

2026-03-14

How to Cite

Soboleva, V., Alanov, A., Kuznetsov, A., & Sobolev, K. (2026). T-LoRA: Single Image Diffusion Model Customization Without Overfitting. Proceedings of the AAAI Conference on Artificial Intelligence, 40(11), 9051–9059. https://doi.org/10.1609/aaai.v40i11.37861

Section

AAAI Technical Track on Computer Vision VIII