Feng, A., I. Li, Y. Jiang, and R. Ying. “Diffuser: Efficient Transformers With Multi-Hop Attention Diffusion for Long Sequences”. Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 11, June 2023, pp. 12772-80, doi:10.1609/aaai.v37i11.26502.