Zhao, Tianyang, Kunwar Yashraj Singh, Srikar Appalaraju, Peng Tang, Vijay Mahadevan, R. Manmatha, and Ying Nian Wu. “No Head Left Behind – Multi-Head Alignment Distillation for Transformers.” Proceedings of the AAAI Conference on Artificial Intelligence 38, no. 7 (March 24, 2024): 7514–7524. Accessed May 26, 2026. https://ojs.aaai.org/index.php/AAAI/article/view/28583.