[1]
A. Bulatov, Y. Kuratov, Y. Kapushev, and M. Burtsev, “Beyond Attention: Breaking the Limits of Transformer Context Length with Recurrent Memory”, AAAI, vol. 38, no. 16, pp. 17700–17708, Mar. 2024.