[1] Z. Ding, G. Jiang, S. Zhang, L. Guo, and W. Lin, “How to Trade Off the Quantity and Capacity of Teacher Ensemble: Learning Categorical Distribution to Stochastically Employ a Teacher for Distillation,” AAAI, vol. 38, no. 16, pp. 17915–17923, Mar. 2024.