(1) Dong, Y.; Miao, Y.; Li, W.; Zheng, X.; Wang, C.; Wu, J.; Lyu, F. Accelerating LLM Inference Throughput via Asynchronous KV Cache Prefetching. AAAI 2026, 40, 20844–20851.