Wang, Chengcheng, Haowen He, Liang Zhao, Xiaoheng Deng, Lixin Duan, and Shaohua Wan. 2026. “CasMoE: A Cascaded Framework for Efficient MoE Inference on Resource-Constrained Devices.” Proceedings of the AAAI Conference on Artificial Intelligence 40 (31): 26133–41. https://doi.org/10.1609/aaai.v40i31.39816.