[1] T. Zhao, “No Head Left Behind – Multi-Head Alignment Distillation for Transformers”, AAAI, vol. 38, no. 7, pp. 7514–7524, Mar. 2024.