[1] A. Feng, I. Li, Y. Jiang, and R. Ying, “Diffuser: Efficient Transformers with Multi-Hop Attention Diffusion for Long Sequences,” AAAI, vol. 37, no. 11, pp. 12772–12780, Jun. 2023.