[1] Z. Qin, Z. He, N. Prakriya, J. Cong, and Y. Sun, “Dynamic-Width Speculative Beam Decoding for LLM Inference,” AAAI, vol. 39, no. 23, pp. 25056–25064, Apr. 2025.