Dong, Y., Miao, Y., Li, W., Zheng, X., Wang, C., Wu, J., & Lyu, F. (2026). Accelerating LLM Inference Throughput via Asynchronous KV Cache Prefetching. Proceedings of the AAAI Conference on Artificial Intelligence, 40(25), 20844–20851. https://doi.org/10.1609/aaai.v40i25.39224