[1] J. Liu, “Balanced Knowledge Distillation for Large Language Models with Mix-of-Experts”, AAAI, vol. 40, no. 28, pp. 23694–23702, Mar. 2026.