[1] P. Passban, Y. Wu, M. Rezagholizadeh, and Q. Liu, “ALP-KD: Attention-Based Layer Projection for Knowledge Distillation”, AAAI, vol. 35, no. 15, pp. 13657-13665, May 2021.