Towards Multimodal Sentiment Analysis via Hierarchical Correlation Modeling with Semantic Distribution Constraints

Authors

  • Qinfu Xu, Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China)
  • Yiwei Wei, China University of Petroleum (Beijing) at Karamay
  • Chunlei Wu, Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China)
  • Leiquan Wang, Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China)
  • Shaozu Yuan, JD AI Research
  • Jie Wu, Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China)
  • Jing Lu, Qingdao Institute of Software, College of Computer Science and Technology, China University of Petroleum (East China)
  • Hengyang Zhou, China University of Petroleum (Beijing) at Karamay

DOI:

https://doi.org/10.1609/aaai.v39i20.35484

Abstract

Sentiment analysis is advancing rapidly by exploiting multiple data modalities (e.g., text, video, and audio). However, most existing techniques learn only atomic-level features that reflect strong correlations, ignoring more complex compositions in multimodal data. They also neglect the incongruity in semantic distribution among modalities. To address this, we introduce a novel Hierarchical Correlation Modeling Network (HCMNet), which enhances multimodal sentiment analysis by exploring both atomic-level correlations, based on dynamic attention reasoning, and composition-level correlations, through topological graph reasoning. We also alleviate the impact of distributional inconsistencies between modalities from both atomic-level and composition-level perspectives. Specifically, we first design an atomic-level contrastive loss that constrains the semantic distribution across modalities to mitigate atomic-level inconsistency. We then design a graph optimal transport module that integrates transport flows with different graphs to constrain the composition-level semantic distribution, thus reducing the inconsistency of compositional nodes. Experiments on three public benchmark datasets demonstrate the superiority of the proposed model over state-of-the-art methods.
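
To make the idea of a cross-modal contrastive constraint concrete, the sketch below computes a generic symmetric InfoNCE-style loss between paired feature vectors from two modalities. It is an illustration only, not the HCMNet loss: the abstract does not give the formulation, so the function names (`symmetric_info_nce`, `cosine_logits`), the use of cosine similarity, and the `temperature` value are all assumptions made for this example.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_logits(a_feats, b_feats, temperature):
    """Pairwise cosine similarities between two batches, divided by temperature."""
    a = [l2_normalize(v) for v in a_feats]
    b = [l2_normalize(v) for v in b_feats]
    return [[sum(x * y for x, y in zip(u, w)) / temperature for w in b] for u in a]


def diagonal_cross_entropy(logits):
    """Mean of -log softmax(row)[i], treating the i-th column as the positive."""
    total = 0.0
    for i, row in enumerate(logits):
        peak = max(row)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def symmetric_info_nce(a_feats, b_feats, temperature=0.1):
    """Hypothetical helper: average the a-to-b and b-to-a contrastive losses.

    Sample i in modality A is assumed to be paired with sample i in modality B;
    every other sample in the batch serves as a negative.
    """
    logits = cosine_logits(a_feats, b_feats, temperature)
    logits_t = [list(col) for col in zip(*logits)]
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits_t))


# Toy features: two samples, two dimensions per modality (made-up values).
text_feats = [[1.0, 0.0], [0.0, 1.0]]
audio_aligned = [[0.9, 0.1], [0.1, 0.9]]
audio_shuffled = [[0.1, 0.9], [0.9, 0.1]]

loss_aligned = symmetric_info_nce(text_feats, audio_aligned)
loss_shuffled = symmetric_info_nce(text_feats, audio_shuffled)
```

In this toy setup, the loss is smaller when paired samples point in similar directions across modalities, which is the sense in which a contrastive term can pull cross-modal semantic distributions together.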

Published

2025-04-11

How to Cite

Xu, Q., Wei, Y., Wu, C., Wang, L., Yuan, S., Wu, J., … Zhou, H. (2025). Towards Multimodal Sentiment Analysis via Hierarchical Correlation Modeling with Semantic Distribution Constraints. Proceedings of the AAAI Conference on Artificial Intelligence, 39(20), 21788–21796. https://doi.org/10.1609/aaai.v39i20.35484

Section

AAAI Technical Track on Machine Learning VI