L2-LoRA: Improving Low-Rank Adaptation with Layer-Specific Regularization
DOI:
https://doi.org/10.1609/aaai.v40i41.40784
Abstract
Fine-tuning large language models (LLMs) in a parameter-efficient manner while preserving their pre-trained world knowledge remains a significant challenge. While Low-Rank Adaptation (LoRA) and its variants effectively mitigate catastrophic forgetting, they do not fully eliminate the loss of critical pre-trained knowledge. In this work, we first analyze the layer-wise distribution of domain-specific knowledge within LLMs through knowledge localization, and empirically identify a clear layer-specific pattern: pre-trained world knowledge predominantly resides in lower layers, whereas knowledge relevant to downstream tasks is more concentrated in higher layers. Motivated by this observation, we propose L2-LoRA, a simple yet effective variant of LoRA that applies layer-specific L2 regularization to the LoRA weights during fine-tuning. Specifically, L2-LoRA imposes stronger regularization on lower layers to preserve pre-trained world knowledge, while allowing greater adaptation in higher layers to better align with downstream tasks. Experiments across multiple benchmarks show that L2-LoRA not only consistently outperforms vanilla LoRA in downstream performance, but also effectively mitigates catastrophic forgetting by retaining more pre-trained knowledge.
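The layer-specific regularization described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact penalty form or how the strength varies across layers, so everything concrete here is an assumption: the linear decay schedule, the function names `layer_coefficients` and `l2_penalty`, and the choice to penalize the squared Frobenius norm of both LoRA factors. It is meant only to show the idea of a larger coefficient on lower layers and a smaller one on higher layers, not the authors' implementation.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def layer_coefficients(num_layers: int, lam_low: float, lam_high: float) -> List[float]:
    """Hypothetical schedule: interpolate linearly from lam_low (layer 0,
    the lowest layer) to lam_high (the top layer). With lam_low > lam_high,
    lower layers are regularized more strongly."""
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    if num_layers == 1:
        return [lam_low]
    step = (lam_high - lam_low) / (num_layers - 1)
    return [lam_low + i * step for i in range(num_layers)]


def squared_frobenius(m: Matrix) -> float:
    """Sum of squared entries of a matrix given as nested sequences."""
    return sum(x * x for row in m for x in row)


def l2_penalty(lora_a: Sequence[Matrix], lora_b: Sequence[Matrix],
               coeffs: Sequence[float]) -> float:
    """Assumed penalty: sum over layers of coeff * (||A||_F^2 + ||B||_F^2),
    where A and B are that layer's LoRA factors."""
    if not (len(lora_a) == len(lora_b) == len(coeffs)):
        raise ValueError("need one A, one B, and one coefficient per layer")
    return sum(
        c * (squared_frobenius(a) + squared_frobenius(b))
        for a, b, c in zip(lora_a, lora_b, coeffs)
    )


# Toy example: 3 layers, identical rank-1 LoRA factors in each layer.
coeffs = layer_coefficients(3, lam_low=0.1, lam_high=0.0)
a_factors = [[[1.0, 2.0]] for _ in range(3)]    # each A is 1 x 2
b_factors = [[[1.0], [0.0]] for _ in range(3)]  # each B is 2 x 1
penalty = l2_penalty(a_factors, b_factors, coeffs)
```

In a training loop, a term like `penalty` would be added to the task loss, so that updates to lower-layer LoRA weights are discouraged more than updates to higher-layer ones.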
Published
2026-03-14
How to Cite
Zhang, X., Xie, R., & Zhang, S. (2026). L2-LoRA: Improving Low-Rank Adaptation with Layer-Specific Regularization. Proceedings of the AAAI Conference on Artificial Intelligence, 40(41), 34817–34826. https://doi.org/10.1609/aaai.v40i41.40784
Section
AAAI Technical Track on Natural Language Processing VI