DialogXpert: Driving Intelligent and Emotion-Aware Conversations through Online Value-Based Reinforcement Learning with LLM Priors

23 May 2025

Abstract

Large-language-model (LLM) agents excel at reactive dialogue but struggle with proactive, goal-driven interactions due to myopic decoding and costly planning. We introduce DialogXpert, which leverages a frozen LLM to propose a small, high-quality set of candidate actions per turn and employs a compact Q-network over fixed BERT embeddings trained via temporal-difference learning to select optimal moves within this reduced space. By tracking the user's emotions, DialogXpert tailors each decision to advance the task while nurturing a genuine, empathetic connection. Across negotiation, emotional support, and tutoring benchmarks, DialogXpert drives conversations to under 3 turns with success rates exceeding 94% and, with a larger LLM prior, pushes success above 97% while markedly improving negotiation outcomes. This framework delivers real-time, strategic, and emotionally intelligent dialogue planning at scale. Code available at this https URL
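The selection step described in the abstract (pick the highest-value action from a small candidate set, then adjust the value estimate with a temporal-difference update) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not specify the network architecture, reward, or hyperparameters, so a linear Q-function stands in for the compact Q-network, plain feature vectors stand in for the fixed BERT embeddings, and the candidate list is passed in directly rather than proposed by a frozen LLM. The names `q_value`, `select_action`, and `td_update` are hypothetical.

```python
import random

def q_value(weights, features):
    """Linear stand-in for the Q-network: Q(s, a) = w . phi(s, a)."""
    return sum(w * x for w, x in zip(weights, features))

def select_action(weights, candidates, featurize, state, epsilon=0.0, rng=random):
    """Epsilon-greedy choice restricted to the proposed candidate set.

    `candidates` plays the role of the small action set an LLM prior would
    propose; `featurize(state, action)` is a hypothetical placeholder for
    embedding the state-action pair.
    """
    if rng.random() < epsilon:
        return rng.choice(candidates)
    return max(candidates, key=lambda a: q_value(weights, featurize(state, a)))

def td_update(weights, features, reward, next_max_q, done, alpha=0.1, gamma=0.9):
    """One temporal-difference (Q-learning style) step on the linear weights.

    alpha and gamma are illustrative values, not taken from the paper.
    """
    target = reward if done else reward + gamma * next_max_q
    td_error = target - q_value(weights, features)
    return [w + alpha * td_error * x for w, x in zip(weights, features)]
```

In this sketch the maximization in `select_action` runs only over the handful of candidates, which is the point of the reduced action space: the value function never has to score the full strategy set on each turn.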


@article{rakib2025_2505.17795,
  title={DialogXpert: Driving Intelligent and Emotion-Aware Conversations through Online Value-Based Reinforcement Learning with LLM Priors},
  author={Tazeek Bin Abdur Rakib and Ambuj Mehrish and Lay-Ki Soon and Wern Han Lim and Soujanya Poria},
  journal={arXiv preprint arXiv:2505.17795},
  year={2025}
}

Main: 7 Pages

4 Figures

Bibliography: 4 Pages

11 Tables

Appendix: 24 Pages
