Zhao, M. (2026) “Making Every Head Count: Sparse Attention Without the Speed-Performance Trade-off”, Proceedings of the AAAI Conference on Artificial Intelligence, 40(41), pp. 34959–34967. doi: 10.1609/aaai.v40i41.40800.