Towards Multimodal Sentiment Analysis via Hierarchical Correlation Modeling with Semantic Distribution Constraints
DOI:
https://doi.org/10.1609/aaai.v39i20.35484
Abstract
Sentiment analysis is rapidly advancing by utilizing various data modalities (e.g., text, video, and audio). However, most existing techniques learn only the atomic-level features that reflect strong correlations, while ignoring more complex compositions in multimodal data. Moreover, they neglect the incongruity in semantic distribution among modalities. In light of this, we introduce a novel Hierarchical Correlation Modeling Network (HCMNet), which enhances multimodal sentiment analysis by exploring both the atomic-level correlations based on dynamic attention reasoning and the composition-level correlations through topological graph reasoning. In addition, we alleviate the impact of distributional inconsistencies between modalities from both atomic-level and composition-level perspectives. Specifically, we first design an atomic-level contrastive loss that constrains the semantic distribution across modalities to mitigate the atomic-level inconsistency. Then, we design a graph optimal transport module that integrates transport flows with different graphs to constrain the composition-level semantic distribution, thus reducing the inconsistency of compositional nodes. Experiments on three public benchmark datasets demonstrate the superiority of the proposed model over state-of-the-art methods.
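The abstract does not give the exact form of the atomic-level contrastive loss, so the following is only a minimal sketch of one common choice, a symmetric InfoNCE loss between paired embeddings from two modalities. The function name `cross_modal_info_nce`, the `temperature` parameter, and the use of text and audio embeddings are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def cross_modal_info_nce(text_emb, audio_emb, temperature=0.1):
    """Symmetric InfoNCE loss for a batch of paired embeddings.

    Hypothetical illustration: row i of `text_emb` and row i of `audio_emb`
    are assumed to come from the same sample (the positive pair); all other
    rows in the batch act as negatives.
    """
    # L2-normalize so the dot product is cosine similarity.
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    a = audio_emb / np.linalg.norm(audio_emb, axis=1, keepdims=True)

    # Pairwise similarity matrix of shape (batch, batch), scaled by temperature.
    logits = (t @ a.T) / temperature

    def diagonal_cross_entropy(scores):
        # Numerically stable log-softmax over each row.
        scores = scores - scores.max(axis=1, keepdims=True)
        log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
        # The matching pair sits on the diagonal.
        return -np.mean(np.diag(log_probs))

    # Average the text-to-audio and audio-to-text directions.
    return 0.5 * (diagonal_cross_entropy(logits) + diagonal_cross_entropy(logits.T))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    text = rng.normal(size=(8, 16))
    audio = text + 0.1 * rng.normal(size=(8, 16))  # roughly aligned modality
    print(cross_modal_info_nce(text, audio))
```

Minimizing a loss of this kind pulls embeddings of the same sample together across modalities and pushes mismatched pairs apart, which is one way to constrain the cross-modal semantic distribution in the sense the abstract describes.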
Published
2025-04-11
How to Cite
Xu, Q., Wei, Y., Wu, C., Wang, L., Yuan, S., Wu, J., … Zhou, H. (2025). Towards Multimodal Sentiment Analysis via Hierarchical Correlation Modeling with Semantic Distribution Constraints. Proceedings of the AAAI Conference on Artificial Intelligence, 39(20), 21788–21796. https://doi.org/10.1609/aaai.v39i20.35484
Section
AAAI Technical Track on Machine Learning VI