L2-LoRA: Improving Low-Rank Adaptation with Layer-Specific Regularization

Authors

  • Xiang Zhang, Peking University
  • Rui Xie, Peking University
  • Shikun Zhang, Peking University

DOI:

https://doi.org/10.1609/aaai.v40i41.40784

Abstract

Fine-tuning large language models (LLMs) in a parameter-efficient manner while preserving their pre-trained world knowledge remains a significant challenge. While Low-Rank Adaptation (LoRA) and its variants effectively mitigate catastrophic forgetting, they do not fully eliminate the loss of critical pre-trained knowledge. In this work, we first analyze the layer-wise distribution of domain-specific knowledge within LLMs through knowledge localization, and empirically identify a clear layer-specific pattern: pre-trained world knowledge predominantly resides in lower layers, whereas knowledge relevant to downstream tasks is more concentrated in higher layers. Motivated by this observation, we propose L2-LoRA, a simple yet effective variant of LoRA that applies layer-specific L2 regularization to the LoRA weights during fine-tuning. Specifically, L2-LoRA imposes stronger regularization on lower layers to preserve pre-trained world knowledge, while allowing greater adaptation in higher layers to better align with downstream tasks. Experiments across multiple benchmarks show that L2-LoRA not only consistently outperforms vanilla LoRA in downstream performance, but also effectively mitigates catastrophic forgetting by retaining more pre-trained knowledge.
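
The layer-specific regularization described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact penalty or how its strength varies with depth, so everything concrete here is an assumption: the linear decay schedule, the names `layer_coefficients`, `layerwise_l2_penalty`, `lam_max` and `lam_min`, and the choice to penalize the squared Frobenius norm of both LoRA factors. It shows only the general idea of a larger L2 coefficient on lower layers and a smaller one on higher layers, not the authors' implementation.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def layer_coefficients(num_layers: int, lam_max: float, lam_min: float) -> List[float]:
    """Hypothetical schedule (an assumption, not taken from the paper):
    decay the L2 coefficient linearly from lam_max at the lowest layer
    (index 0) to lam_min at the highest layer."""
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    if num_layers == 1:
        return [lam_max]
    step = (lam_max - lam_min) / (num_layers - 1)
    return [lam_max - step * i for i in range(num_layers)]


def squared_frobenius(m: Matrix) -> float:
    """Sum of squared entries of a matrix given as nested sequences."""
    return sum(x * x for row in m for x in row)


def layerwise_l2_penalty(
    lora_a: Sequence[Matrix],
    lora_b: Sequence[Matrix],
    coeffs: Sequence[float],
) -> float:
    """Weighted L2 penalty over per-layer LoRA factors A_l and B_l:
    sum_l coeffs[l] * (||A_l||_F^2 + ||B_l||_F^2).
    Penalizing both factors is an assumption made for illustration."""
    if not (len(lora_a) == len(lora_b) == len(coeffs)):
        raise ValueError("expected one A, one B and one coefficient per layer")
    return sum(
        c * (squared_frobenius(a) + squared_frobenius(b))
        for a, b, c in zip(lora_a, lora_b, coeffs)
    )


# Toy usage: three layers, each with 2x2 all-ones LoRA factors.
ones = [[1.0, 1.0], [1.0, 1.0]]
coeffs = layer_coefficients(num_layers=3, lam_max=0.1, lam_min=0.0)
penalty = layerwise_l2_penalty([ones] * 3, [ones] * 3, coeffs)
```

In a training loop, a term like `penalty` would be added to the downstream task loss, so that updates to lower-layer LoRA weights are discouraged more strongly than updates to higher-layer ones.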

Published

2026-03-14

How to Cite

Zhang, X., Xie, R., & Zhang, S. (2026). L2-LoRA: Improving Low-Rank Adaptation with Layer-Specific Regularization. Proceedings of the AAAI Conference on Artificial Intelligence, 40(41), 34817–34826. https://doi.org/10.1609/aaai.v40i41.40784

Section

AAAI Technical Track on Natural Language Processing VI