(1) Dai, X.; Xie, Y.; Liu, M.; Wang, X.; Li, Z.; Wang, H.; Lui, J. C. A Multi-Agent Conversational Bandit Approach to Online Evaluation and Selection of User-Aligned LLM Responses. AAAI 2026, 40, 37323-37331.