DialogXpert: Driving Intelligent and Emotion-Aware Conversations Through Online Value-Based Reinforcement Learning with LLM Priors

Tazeek Bin Abdur Rakib; Ambuj Mehrish; Lay-Ki Soon; Wern Han Lim; Soujanya Poria

doi:10.1609/aaai.v40i36.40244

Authors

Tazeek Bin Abdur Rakib (Monash University, Malaysia)
Ambuj Mehrish (Singapore University of Technology and Design)
Lay-Ki Soon (Monash University, Malaysia)
Wern Han Lim (Monash University, Malaysia)
Soujanya Poria (Nanyang Technological University)

Abstract

Large language model (LLM) agents excel at reactive dialogue but struggle with proactive, goal-driven interactions due to myopic decoding and costly planning. We introduce DialogXpert, which leverages a frozen LLM to propose a small, high-quality set of candidate actions per turn and employs a compact Q-network over fixed BERT embeddings, trained via temporal-difference learning, to select optimal moves within this reduced space. By tracking the user's emotions, DialogXpert tailors each decision to advance the task while nurturing a genuine, empathetic connection. Across negotiation, emotional support, and tutoring benchmarks, DialogXpert drives conversations to fewer than 3 turns with success rates exceeding 94% and, with a larger LLM prior, pushes success above 97% while markedly improving negotiation outcomes. This framework delivers real-time, strategic, and emotionally intelligent dialogue planning at scale.
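
The selection loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `embed` is a hypothetical hash-based stand-in for the fixed BERT embeddings, `propose_actions` is a hypothetical stand-in for the frozen LLM prior, the action names are invented, and a linear Q-function replaces the compact Q-network so the example runs on the Python standard library alone. It shows only the general pattern: restrict the action space to a few candidates, pick the one with the highest Q-value, and update with a one-step temporal-difference target.

```python
import hashlib
import random

DIM = 16  # toy feature size (assumption); the paper uses fixed BERT embeddings


def embed(text):
    """Hypothetical stand-in for a frozen encoder: deterministic pseudo-embedding."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 - 0.5 for b in digest[:DIM]]


def make_state(history, emotion):
    """Fold the tracked user emotion into the state text (illustrative format)."""
    return f"{history} | emotion={emotion}"


def propose_actions(state, k=3):
    """Hypothetical stand-in for the frozen LLM prior: k candidate actions."""
    pool = ["ask_question", "show_empathy", "make_offer", "give_hint", "summarize"]
    rng = random.Random(state)  # seeded by the state, so proposals are repeatable
    return rng.sample(pool, k)


class LinearQ:
    """Linear Q-function over concatenated state and action features."""

    def __init__(self, lr=0.1, gamma=0.9):
        self.w = [0.0] * (2 * DIM)
        self.lr = lr
        self.gamma = gamma

    def features(self, state, action):
        return embed(state) + embed(action)

    def q(self, state, action):
        return sum(w * x for w, x in zip(self.w, self.features(state, action)))

    def select(self, state, candidates, epsilon=0.0, rng=random):
        """Epsilon-greedy choice restricted to the proposed candidates."""
        if rng.random() < epsilon:
            return rng.choice(candidates)
        return max(candidates, key=lambda a: self.q(state, a))

    def td_update(self, state, action, reward, next_state, next_candidates, done):
        """One-step TD update: target = r + gamma * max_a' Q(s', a')."""
        target = reward
        if not done:
            target += self.gamma * max(self.q(next_state, a) for a in next_candidates)
        td_error = target - self.q(state, action)
        for i, x in enumerate(self.features(state, action)):
            self.w[i] += self.lr * td_error * x
        return td_error
```

In use, each turn would call `propose_actions` on the current state, `select` among the returned candidates, observe a reward, and then call `td_update` with the next state's candidates. The reward signal and the emotion labels are left to the caller, since the abstract does not specify them.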

Published

2026-03-14

How to Cite

Abdur Rakib, T. B., Mehrish, A., Soon, L.-K., Lim, W. H., & Poria, S. (2026). DialogXpert: Driving Intelligent and Emotion-Aware Conversations Through Online Value-Based Reinforcement Learning with LLM Priors. Proceedings of the AAAI Conference on Artificial Intelligence, 40(36), 29967–29975. https://doi.org/10.1609/aaai.v40i36.40244

Issue

Vol. 40 No. 36: AAAI-26 Technical Tracks 36

Section

AAAI Technical Track on Natural Language Processing I
