Detecting Emotional Dynamic Trajectories: An Evaluation Framework for Emotional Support in Language Models

Authors

  • Zhouxing Tan, National Engineering Research Center for Software Engineering, Peking University, Beijing, China
  • Ruochong Xiong, National Engineering Research Center for Software Engineering, Peking University, Beijing, China
  • Yulong Wan, National Engineering Research Center for Software Engineering, Peking University, Beijing, China
  • Jinlong Ma, Guangzhou Quwan Network Technology, Guangzhou, China
  • Hanlin Xue, National Engineering Research Center for Software Engineering, Peking University, Beijing, China
  • Qichun Deng, Guangzhou Quwan Network Technology, Guangzhou, China
  • Haifeng Jing, National Engineering Research Center for Software Engineering, Peking University, Beijing, China
  • Zhengtong Zhang, Guangzhou Quwan Network Technology, Guangzhou, China
  • Depei Liu, National Engineering Research Center for Software Engineering, Peking University, Beijing, China
  • Shiyuan Luo, Guangzhou Quwan Network Technology, Guangzhou, China
  • Junfei Liu, National Engineering Research Center for Software Engineering, Peking University, Beijing, China

DOI:

https://doi.org/10.1609/aaai.v40i3.37189

Abstract

Emotional support is a core capability in human-AI interaction, with applications including psychological counseling, role play, and companionship. However, existing evaluations of large language models (LLMs) often rely on short, static dialogues and fail to capture the dynamic and long-term nature of emotional support. To overcome this limitation, we shift from snapshot-based evaluation to trajectory-based assessment, adopting a user-centered perspective that evaluates models based on their ability to improve and stabilize user emotional states over time. Our framework constructs a large-scale benchmark consisting of 328 emotional contexts and 1,152 disturbance events, simulating realistic emotional shifts under evolving dialogue scenarios. To encourage psychologically grounded responses, we constrain model outputs using validated emotion regulation strategies such as situation selection and cognitive reappraisal. User emotional trajectories are modeled as a first-order Markov process, and we apply causally adjusted emotion estimation to obtain unbiased emotional state tracking. Based on this framework, we introduce three trajectory-level metrics: Baseline Emotional Level (BEL), Emotional Trajectory Volatility (ETV), and Emotional Centroid Position (ECP). These metrics collectively capture user emotional dynamics over time and support comprehensive evaluation of the long-term emotional support performance of LLMs. Extensive evaluations across a diverse set of LLMs reveal significant disparities in emotional support capabilities and provide actionable insights for model development.
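
As a rough illustration of what modeling an emotional trajectory as a first-order Markov process can look like, the sketch below simulates a sequence of user states in which each state depends only on the previous one, then computes two simple summaries of the trajectory. Everything in it is an assumption made for illustration: the three states, their valence scores, the transition probabilities, and the helper names are hypothetical, and the two summaries (average level and average step-to-step change) are generic stand-ins. The abstract does not give formulas for BEL, ETV, or ECP, so this should not be read as the paper's definitions.

```python
import random

# Hypothetical discrete emotional states with assumed valence scores.
STATES = ["negative", "neutral", "positive"]
VALENCE = {"negative": -1.0, "neutral": 0.0, "positive": 1.0}

# Assumed transition probabilities P(next state | current state).
# Each row lists weights over STATES in order and sums to 1.
TRANSITIONS = {
    "negative": [0.6, 0.3, 0.1],
    "neutral": [0.2, 0.5, 0.3],
    "positive": [0.1, 0.3, 0.6],
}


def simulate_trajectory(start, steps, seed=0):
    """Sample a trajectory where each state depends only on the previous one."""
    rng = random.Random(seed)
    trajectory = [start]
    for _ in range(steps):
        current = trajectory[-1]
        nxt = rng.choices(STATES, weights=TRANSITIONS[current], k=1)[0]
        trajectory.append(nxt)
    return trajectory


def mean_level(trajectory):
    """Average valence over the trajectory (illustrative level summary)."""
    values = [VALENCE[s] for s in trajectory]
    return sum(values) / len(values)


def mean_step_change(trajectory):
    """Average absolute valence change between consecutive turns
    (illustrative volatility summary)."""
    values = [VALENCE[s] for s in trajectory]
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return sum(diffs) / len(diffs) if diffs else 0.0


if __name__ == "__main__":
    traj = simulate_trajectory("negative", steps=20, seed=42)
    print(traj)
    print("mean level:", mean_level(traj))
    print("mean step change:", mean_step_change(traj))
```

Under these assumptions, a higher average level together with a lower average step change would correspond to a user whose emotional state has been improved and stabilized over the dialogue, which is the user-centered outcome the abstract says the framework evaluates.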

Published

2026-03-14

How to Cite

Tan, Z., Xiong, R., Wan, Y., Ma, J., Xue, H., Deng, Q., … Liu, J. (2026). Detecting Emotional Dynamic Trajectories: An Evaluation Framework for Emotional Support in Language Models. Proceedings of the AAAI Conference on Artificial Intelligence, 40(3), 2074–2082. https://doi.org/10.1609/aaai.v40i3.37189

Section

AAAI Technical Track on Cognitive Modeling & Cognitive Systems