(1) Qin, Z.; He, Z.; Prakriya, N.; Cong, J.; Sun, Y. Dynamic-Width Speculative Beam Decoding for LLM Inference. AAAI 2025, 39, 25056-25064.