Dai, Xiangxiang, Yuejin Xie, Maoli Liu, Xuchuang Wang, Zhuohua Li, Huanyu Wang, and John C.S. Lui. 2026. “A Multi-Agent Conversational Bandit Approach to Online Evaluation and Selection of User-Aligned LLM Responses.” Proceedings of the AAAI Conference on Artificial Intelligence 40 (44): 37323–31. https://doi.org/10.1609/aaai.v40i44.41064.