Zhao, M., Hu, W., Wang, J., Lai, X., Huang, T., Min, Y., … Zhu, X. (2026). Making Every Head Count: Sparse Attention Without the Speed-Performance Trade-off. Proceedings of the AAAI Conference on Artificial Intelligence, 40(41), 34959–34967. https://doi.org/10.1609/aaai.v40i41.40800