Feng, Aosong, et al. “Diffuser: Efficient Transformers with Multi-Hop Attention Diffusion for Long Sequences.” Proceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 11, June 2023, pp. 12772-80, doi:10.1609/aaai.v37i11.26502.