Optimizing LoRA Allocation of MoE with the Alignment of Topic Correlation

Authors

  • Hengyuan Xu College of Software Engineering, Southeast University, China
  • Wenjun Ke School of Computer Science and Engineering, Southeast University, China Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, China
  • Yao He Institute of Collaborative Innovation, University of Macau, Macau, China
  • Jiajun Liu School of Computer Science and Engineering, Southeast University, China
  • Dong Nie Meta Inc.
  • Peng Wang School of Computer Science and Engineering, Southeast University, China Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, China
  • Ziyu Shang School of Computer Science and Engineering, Southeast University, China
  • Zijie Xu School of Computer Science and Engineering, Southeast University, China

DOI:

https://doi.org/10.1609/aaai.v40i32.39941

Abstract

Mixture of experts (MoE) dynamically routes inputs to specialized expert networks, scaling model capacity with low inference overhead. However, the excessive parameter growth of MoE models poses challenges in low-resource settings. To address this issue, MoE with parameter-efficient fine-tuning (PEFT) has emerged as a lightweight adaptation paradigm that distributes knowledge among experts via multiple LoRA blocks. Existing MoE-PEFT methods can be broadly categorized into External and Internal PEFT methods. External PEFT methods incorporate lightweight models into existing MoE architectures without modifying their routing, which limits parameter efficiency. To overcome this limitation, Internal PEFT methods integrate MoE architectures into PEFT, enabling minimal parameter overhead. However, they still face two major challenges: (1) a lack of expert functional differentiation, resulting in overlapping specialization across modules, and (2) the absence of a structured attribution mechanism to guide expert selection based on semantic relevance. To alleviate these challenges, we propose TopicLoRA, a novel three-stage framework that leverages topic knowledge as semantic anchors to guide expert allocation. Specifically, (1) to address expert redundancy, we construct a topic-level prior graph using Graph Neural Network-enhanced representation learning over Big-Bench categories, enforcing structural separation among expert embeddings, and (2) to introduce semantic attribution, we design a dual-loss training mechanism that softly aligns input-query relevance with topic-guided routing distributions via KL divergence. Extensive experiments on representative datasets (e.g., MMLU, GSM8K, Flanv2) demonstrate that TopicLoRA outperforms state-of-the-art PEFT baselines by an average of 2.40% in accuracy, with a maximum improvement of 4.21%.
Furthermore, ablation studies demonstrate our framework's robustness to intricate topics and input-sequence variations, which stems from the dual-loss training mechanism.
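The KL-based alignment in the dual-loss mechanism described above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' implementation: the function names, the softmax over a topic-relevance score vector, and the weighting factor `alpha` are all assumptions.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) = sum_i p_i * log(p_i / q_i); eps guards against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def dual_loss(task_loss, router_logits, topic_relevance, alpha=0.1):
    """Hypothetical combined objective: the task loss plus an
    alpha-weighted KL term that pulls the routing distribution
    toward the topic-guided relevance distribution."""
    routing = softmax(router_logits)
    target = softmax(topic_relevance)
    return task_loss + alpha * kl_divergence(target, routing)

# When router logits already match the topic relevance scores,
# the KL term vanishes and only the task loss remains.
loss = dual_loss(1.0, [2.0, 1.0, 0.5], [2.0, 1.0, 0.5])
```

In practice such a soft alignment (rather than a hard assignment of experts to topics) lets the router deviate from the topic prior when the task loss favors it, which is consistent with the robustness the ablations attribute to the dual-loss design.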

Published

2026-03-14

How to Cite

Xu, H., Ke, W., He, Y., Liu, J., Nie, D., Wang, P., … Xu, Z. (2026). Optimizing LoRA Allocation of MoE with the Alignment of Topic Correlation. Proceedings of the AAAI Conference on Artificial Intelligence, 40(32), 27251–27259. https://doi.org/10.1609/aaai.v40i32.39941

Section

AAAI Technical Track on Machine Learning IX