TY - JOUR
AU - Zadeh, Amir
AU - Liang, Paul Pu
AU - Poria, Soujanya
AU - Vij, Prateek
AU - Cambria, Erik
AU - Morency, Louis-Philippe
PY - 2018/04/27
Y2 - 2024/03/28
TI - Multi-attention Recurrent Network for Human Communication Comprehension
JF - Proceedings of the AAAI Conference on Artificial Intelligence
JA - AAAI
VL - 32
IS - 1
SE - Main Track: NLP and Machine Learning
DO - 10.1609/aaai.v32i1.12024
UR - https://ojs.aaai.org/index.php/AAAI/article/view/12024
SP -
AB - Human face-to-face communication is a complex multimodal signal. We use words (language modality), gestures (vision modality) and changes in tone (acoustic modality) to convey our intentions. Humans easily process and understand face-to-face communication; however, comprehending this form of communication remains a significant challenge for Artificial Intelligence (AI). AI must understand each modality and the interactions between them that shape the communication. In this paper, we present a novel neural architecture for understanding human communication called the Multi-attention Recurrent Network (MARN). The main strength of our model comes from discovering interactions between modalities through time using a neural component called the Multi-attention Block (MAB) and storing them in the hybrid memory of a recurrent component called the Long-short Term Hybrid Memory (LSTHM). We perform extensive comparisons on six publicly available datasets for multimodal sentiment analysis, speaker trait recognition and emotion recognition. MARN shows state-of-the-art performance in all the datasets.
ER -