1. Dong Y, Miao Y, Li W, Zheng X, Wang C, Wu J, et al. Accelerating LLM Inference Throughput via Asynchronous KV Cache Prefetching. AAAI [Internet]. 2026 Mar 14 [cited 2026 May 13];40(25):20844-51. Available from: https://ojs.aaai.org/index.php/AAAI/article/view/39224