[1] Y. Dong, “Accelerating LLM Inference Throughput via Asynchronous KV Cache Prefetching”, AAAI, vol. 40, no. 25, pp. 20844–20851, Mar. 2026.