C-GNN-PRUNE: A Unified Graph-Based Framework for Structure-Aware Pruning of Mixture-of-Experts Models

Lin Li; Yan Wang; Zhuopeng Wang

doi:10.1609/aaai.v40i27.39462

Authors

Lin Li, Inner Mongolia University
Yan Wang, Inner Mongolia University
Zhuopeng Wang, Inner Mongolia University

DOI:

https://doi.org/10.1609/aaai.v40i27.39462

Abstract

The Mixture-of-Experts (MoE) architecture has emerged as a promising paradigm for scaling large language models (LLMs) by activating only a sparse subset of experts per input. However, its massive parameter size remains a major obstacle to efficient deployment. Existing pruning methods often ignore two key aspects: the intricate structural dependencies among experts and the heterogeneous importance of different layers. To tackle these issues, we propose C-GNN-PRUNE, a unified and structure-aware compression framework tailored for MoE models. Our method introduces an Entropy-Guided Allocation Module that dynamically assigns pruning budgets by leveraging expert activation entropy, enabling adaptive handling of inter-layer heterogeneity. To preserve structural collaboration patterns, we construct an expert interaction graph that fuses functional similarity and routing behavior, and employ a GNN-Based Embedding Module to learn structure-aware expert representations. These embeddings, along with co-activation patterns, are fed into a Community Detection Module to identify expert clusters for structured pruning. Finally, an Activation-Aware Selection Module retains the most critical experts in each community, balancing sparsity and expressiveness. Experiments on multiple open-source MoE models demonstrate that C-GNN-PRUNE consistently outperforms prior methods under various pruning ratios, achieving better trade-offs between compression and accuracy. This framework provides a modular and effective solution for structure-preserving compression of large-scale MoE models.
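
To make the idea of assigning pruning budgets from expert activation entropy more concrete, the following is a minimal illustrative sketch, not the authors' implementation. The abstract does not specify how entropy maps to a budget, so everything here is an assumption: the function names `activation_entropy` and `allocate_budgets` are hypothetical, the input is taken to be per-layer expert activation counts, and layers with more concentrated routing (lower normalized entropy) are assumed to receive a larger share of the global budget.

```python
import math


def activation_entropy(counts):
    """Shannon entropy (in nats) of one layer's expert activation counts."""
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    return -sum(p * math.log(p) for p in probs)


def allocate_budgets(layer_counts, total_to_prune):
    """Split a global pruning budget (number of experts) across layers.

    Assumption (not taken from the paper): a layer whose routing is more
    concentrated, i.e. has lower normalized entropy, gets a larger share.
    """
    weights = []
    for counts in layer_counts:
        h_max = math.log(len(counts))  # entropy of perfectly uniform routing
        h_norm = activation_entropy(counts) / h_max if h_max > 0 else 0.0
        weights.append(1.0 - h_norm + 1e-6)  # small floor avoids all-zero weights

    scale = total_to_prune / sum(weights)
    raw = [w * scale for w in weights]
    budgets = [int(r) for r in raw]  # round down first

    # Largest-remainder rounding so the budgets sum to total_to_prune.
    leftover = total_to_prune - sum(budgets)
    order = sorted(range(len(raw)), key=lambda i: raw[i] - budgets[i], reverse=True)
    for i in order[:leftover]:
        budgets[i] += 1
    return budgets


# Hypothetical activation counts for three layers with four experts each.
layer_counts = [
    [25, 25, 25, 25],  # uniform routing: highest entropy
    [97, 1, 1, 1],     # one dominant expert: lowest entropy
    [70, 10, 10, 10],  # in between
]
budgets = allocate_budgets(layer_counts, total_to_prune=4)
print(budgets)
```

Under these assumptions the uniformly routed layer is left untouched while the most concentrated layer absorbs most of the budget. The sketch does not cap a layer's budget at its expert count, and it assumes every layer has at least one recorded activation; a real allocation scheme would need to handle both cases.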
