Dong, Y. (2026) “Accelerating LLM Inference Throughput via Asynchronous KV Cache Prefetching”, Proceedings of the AAAI Conference on Artificial Intelligence, 40(25), pp. 20844–20851. doi: 10.1609/aaai.v40i25.39224.