Dong, Yanhao, Yubo Miao, Weinan Li, Xiao Zheng, Chao Wang, Jiesheng Wu, and Feng Lyu. "Accelerating LLM Inference Throughput via Asynchronous KV Cache Prefetching." Proceedings of the AAAI Conference on Artificial Intelligence 40, no. 25 (March 14, 2026): 20844–20851. Accessed May 13, 2026. https://ojs.aaai.org/index.php/AAAI/article/view/39224.