(1) Wang, C.; He, H.; Zhao, L.; Deng, X.; Duan, L.; Wan, S. CasMoE: A Cascaded Framework for Efficient MoE Inference on Resource-Constrained Devices. AAAI 2026, 40, 26133-26141.