Zhao, Tianyang, et al. “No Head Left Behind – Multi-Head Alignment Distillation for Transformers”. Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 7, Mar. 2024, pp. 7514-2, doi:10.1609/aaai.v38i7.28583.