Feng, A., Li, I., Jiang, Y., & Ying, R. (2023). Diffuser: Efficient Transformers with Multi-Hop Attention Diffusion for Long Sequences. Proceedings of the AAAI Conference on Artificial Intelligence, 37(11), 12772-12780. https://doi.org/10.1609/aaai.v37i11.26502