Causal-ERC: A Multimodal Framework with Causal Prompting for Emotion Recognition in Conversations with Large Language Models

Authors

  • Ran Jing, Harbin Institute of Technology
  • Geng Tu, Harbin Institute of Technology
  • Yice Zhang, Harbin Institute of Technology
  • Ruifeng Xu, Harbin Institute of Technology; Peng Cheng Laboratory; Guangdong Provincial Key Laboratory of Novel Security Intelligence Technologies

DOI:

https://doi.org/10.1609/aaai.v40i37.40402

Abstract

The rapid advancement of large language models (LLMs) has revitalised research in Emotion Recognition in Conversation (ERC). However, existing LLM-based ERC approaches operate solely on textual input, whereas MLLM-based emotion recognition methods in non-conversational scenarios typically perform only basic multimodal fusion and fail to consider speaker-sensitive contextual dependencies, which limits their performance on ERC tasks. To integrate multimodal cues effectively and address the limitations of existing approaches in handling contextual dependencies, we propose a novel LLM-based framework, Causal-ERC, which captures context representations within each modality and incorporates them into the LLM. Moreover, experimental results show that LLMs perform poorly on long conversations. To improve LLMs' ability to model long conversations, we adjust the corresponding causal prompts according to the causal type of each utterance. Experiments on two benchmark multimodal ERC (MERC) datasets demonstrate that our Causal-ERC framework consistently outperforms existing state-of-the-art approaches and improves LLMs' performance in long-context scenarios.

Published

2026-03-14

How to Cite

Jing, R., Tu, G., Zhang, Y., & Xu, R. (2026). Causal-ERC: A Multimodal Framework with Causal Prompting for Emotion Recognition in Conversations with Large Language Models. Proceedings of the AAAI Conference on Artificial Intelligence, 40(37), 31383-31391. https://doi.org/10.1609/aaai.v40i37.40402

Section

AAAI Technical Track on Natural Language Processing II