Optimizing LoRA Allocation of MoE with the Alignment of Topic Correlation

Authors

  • Hengyuan Xu College of Software Engineering, Southeast University, China
  • Wenjun Ke School of Computer Science and Engineering, Southeast University, China Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, China
  • Yao He Institute of Collaborative Innovation, University of Macau, Macau, China
  • Jiajun Liu School of Computer Science and Engineering, Southeast University, China
  • Dong Nie Meta Inc.
  • Peng Wang School of Computer Science and Engineering, Southeast University, China Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, China
  • Ziyu Shang School of Computer Science and Engineering, Southeast University, China
  • Zijie Xu School of Computer Science and Engineering, Southeast University, China

DOI:

https://doi.org/10.1609/aaai.v40i32.39941

Abstract

Mixture of experts (MoE) dynamically routes inputs to specialized expert networks, scaling model capacity with low inference overhead. However, the excessive parameter growth of MoE models poses challenges in low-resource settings. To address this issue, MoE with parameter-efficient fine-tuning (PEFT) has emerged as a lightweight adaptation paradigm that distributes knowledge among experts via multiple LoRA blocks. Existing MoE-PEFT methods can be broadly categorized into External and Internal PEFT methods. External PEFT methods incorporate lightweight models into existing MoE architectures without modifying their routing, which limits parameter efficiency. To overcome this limitation, Internal PEFT methods integrate MoE architectures into PEFT, enabling minimal parameter overhead. However, they still face two major challenges: (1) a lack of expert functional differentiation, resulting in overlapping specialization across modules, and (2) the absence of a structured attribution mechanism to guide expert selection based on semantic relevance. To alleviate these challenges, we propose TopicLoRA, a novel three-stage framework that leverages topic knowledge as semantic anchors to guide expert allocation. Specifically, (1) to address expert redundancy, we construct a topic-level prior graph using Graph Neural Network-enhanced representation learning over Big-Bench categories, enforcing structural separation among expert embeddings, and (2) to introduce semantic attribution, we design a dual-loss training mechanism that softly aligns input-query relevance with topic-guided routing distributions via KL divergence. Extensive experiments on representative datasets (e.g., MMLU, GSM8K, Flanv2) demonstrate that TopicLoRA outperforms state-of-the-art PEFT baselines by an average of 2.40% in accuracy, with a maximum improvement of 4.21%.
Furthermore, ablation studies demonstrate our framework's robustness to intricate topics and input-sequence variations, which stems from the dual-loss training mechanism.
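The KL-based alignment in the dual-loss mechanism described above can be sketched in a few lines. This is an illustrative reconstruction, not the authors' implementation: the function names, the softmax over a topic-relevance score vector, and the weighting factor `alpha` are all assumptions.

```python
import math

def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) = sum_i p_i * log(p_i / q_i); eps guards against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def dual_loss(task_loss, router_logits, topic_relevance, alpha=0.1):
    """Hypothetical combined objective: the task loss plus an
    alpha-weighted KL term that pulls the routing distribution
    toward the topic-guided relevance distribution."""
    routing = softmax(router_logits)
    target = softmax(topic_relevance)
    return task_loss + alpha * kl_divergence(target, routing)

# When router logits already match the topic relevance scores,
# the KL term vanishes and only the task loss remains.
loss = dual_loss(1.0, [2.0, 1.0, 0.5], [2.0, 1.0, 0.5])
```

In practice such a soft alignment (rather than a hard assignment of experts to topics) lets the router deviate from the topic prior when the task loss favors it, which is consistent with the robustness the ablations attribute to the dual-loss design.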

Published

2026-03-14

How to Cite

Xu, H., Ke, W., He, Y., Liu, J., Nie, D., Wang, P., … Xu, Z. (2026). Optimizing LoRA Allocation of MoE with the Alignment of Topic Correlation. Proceedings of the AAAI Conference on Artificial Intelligence, 40(32), 27251–27259. https://doi.org/10.1609/aaai.v40i32.39941

Section

AAAI Technical Track on Machine Learning IX