Ding, Z., G. Jiang, S. Zhang, L. Guo, and W. Lin. “How to Trade Off the Quantity and Capacity of Teacher Ensemble: Learning Categorical Distribution to Stochastically Employ a Teacher for Distillation”. Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 16, Mar. 2024, pp. 17915-23, doi:10.1609/aaai.v38i16.29746.