Towards Holistic, Pragmatic and Multimodal Conversational Systems

Authors

  • Pranava Madhyastha, City, University of London

DOI:

https://doi.org/10.1609/aaai.v38i20.30293

Keywords:

NLP, Multimodal, Vision, Machine Translation

Abstract

Language acquisition and utilization transcend the mere exchange of lexical units. Visual cues, prosody, gestures, body movements, and context play an undeniably crucial role. Humans naturally communicate multimodally, employing multiple channels and synthesizing information from diverse modalities. My research delves into the characterization and construction of multimodal models that seamlessly integrate data from multiple independent modalities. I will cover recent work that highlights the challenges, achievements, and opportunities towards developing capable multimodal discursive models.

Published

2024-03-24

How to Cite

Madhyastha, P. (2024). Towards Holistic, Pragmatic and Multimodal Conversational Systems. Proceedings of the AAAI Conference on Artificial Intelligence, 38(20), 22677-22677. https://doi.org/10.1609/aaai.v38i20.30293