ALTER: Asymmetric LoRA for Token-Entropy-Guided Unlearning of LLMs

Authors

  • Xunlei Chen, School of Information and Software Engineering, University of Electronic Science and Technology of China
  • Jinyu Guo, School of Information and Software Engineering, University of Electronic Science and Technology of China
  • Yuang Li, School of Information and Software Engineering, University of Electronic Science and Technology of China
  • Zhaokun Wang, School of Information and Software Engineering, University of Electronic Science and Technology of China
  • Yi Gong, School of Information and Software Engineering, University of Electronic Science and Technology of China
  • Jie Zou, School of Computer Science and Engineering, University of Electronic Science and Technology of China
  • Jiwei Wei, School of Computer Science and Engineering, University of Electronic Science and Technology of China
  • Wenhong Tian, School of Information and Software Engineering, University of Electronic Science and Technology of China

DOI:

https://doi.org/10.1609/aaai.v40i42.40845

Abstract

Large language models (LLMs) have advanced to encompass extensive knowledge across diverse domains. Yet controlling what an LLM should not know is important for ensuring alignment and thus safe use. However, effective unlearning in LLMs is difficult due to the fuzzy boundary between knowledge retention and forgetting. This challenge is exacerbated by entangled parameter spaces from continuous multi-domain training, often resulting in collateral damage, especially under aggressive unlearning strategies. Furthermore, the computational overhead required to optimize state-of-the-art (SOTA) models with billions of parameters poses an additional barrier. In this work, we present ALTER, a lightweight unlearning framework for LLMs that addresses both knowledge entanglement and unlearning efficiency. ALTER operates in two phases: (I) high-entropy tokens are captured and learned via the shared A matrix in LoRA, followed by (II) an asymmetric LoRA architecture that achieves a specified forgetting objective through parameter isolation and by unlearning tokens within the target subdomains. ALTER thus serves as a new research direction for achieving unlearning via token-level isolation in the asymmetric framework. ALTER achieves SOTA performance on the TOFU, WMDP, and MUSE benchmarks with over 95% forget quality and shows minimal side effects by preserving foundational tokens. By decoupling unlearning from LLMs' billion-scale parameters, this framework delivers excellent efficiency while preserving over 90% of model utility, exceeding baseline preservation rates of 47.8–83.6%.
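
The abstract does not spell out how token entropy is measured or how the asymmetric LoRA is wired, so the following is only a minimal illustrative sketch under stated assumptions, not the authors' implementation. It assumes that "token entropy" means the Shannon entropy of the model's next-token distribution, that high-entropy tokens are picked with a fixed threshold, and that "asymmetric LoRA" means one shared down-projection A with a separate up-projection B per subdomain. All function names, the `threshold` parameter, and the head labels are hypothetical.

```python
import math


def token_entropy(logits):
    """Shannon entropy (in nats) of softmax(logits) at one token position."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def high_entropy_positions(logits_per_token, threshold):
    """Indices of positions whose entropy exceeds a threshold.

    Assumption: a fixed threshold; the paper may use a different selection rule.
    """
    return [i for i, lg in enumerate(logits_per_token)
            if token_entropy(lg) > threshold]


def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def asymmetric_lora_forward(x, W, A, B_heads, head, scale=1.0):
    """Compute y = W x + scale * B_head (A x).

    A is shared by every head; each entry of B_heads is a subdomain-specific
    up-projection (assumed layout). W stands in for the frozen base weight.
    """
    base = matvec(W, x)
    low_rank = matvec(A, x)                  # shared projection to rank r
    delta = matvec(B_heads[head], low_rank)  # head-specific projection back
    return [b + scale * d for b, d in zip(base, delta)]


if __name__ == "__main__":
    # Toy vocabulary of 4: position 0 is uncertain, position 1 is confident.
    logits = [[0.0, 0.0, 0.0, 0.0], [10.0, 0.0, 0.0, 0.0]]
    print(high_entropy_positions(logits, threshold=1.0))

    W = [[1.0, 0.0], [0.0, 1.0]]             # frozen base weight (identity)
    A = [[1.0, 1.0]]                         # shared A, rank 1
    B_heads = {"forget": [[1.0], [0.0]],     # hypothetical subdomain heads
               "retain": [[0.0], [0.0]]}
    print(asymmetric_lora_forward([1.0, 2.0], W, A, B_heads, "forget"))
```

In this sketch, only the selected B head changes the output for a given subdomain while W stays untouched, which is one way to picture the parameter isolation the abstract describes.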

Published

2026-03-14

How to Cite

Chen, X., Guo, J., Li, Y., Wang, Z., Gong, Y., Zou, J., … Tian, W. (2026). ALTER: Asymmetric LoRA for Token-Entropy-Guided Unlearning of LLMs. Proceedings of the AAAI Conference on Artificial Intelligence, 40(42), 35366–35374. https://doi.org/10.1609/aaai.v40i42.40845

Section

AAAI Technical Track on Philosophy and Ethics of AI