(1)
Zhang, Z.; Shao, W.; Ge, Y.; Wang, X.; Gu, J.; Luo, P. Cached Transformers: Improving Transformers With Differentiable Memory Cache. AAAI 2024, 38, 16935-16943.