[1] Z. Gao, “Multi-Branch Self-Drafting for LLM Inference Acceleration,” AAAI, vol. 39, no. 22, pp. 23942-23950, Apr. 2025.