Gesture Annotation With a Visual Search Engine for Multimodal Communication Research

Sergiy Turchyn; Inés Olza Moreno; Cristóbal Pagán Cánovas; Francis Steen; Mark Turner; Javier Valenzuela; Soumya Ray

doi:10.1609/aaai.v32i1.11421

Authors

Sergiy Turchyn, Case Western Reserve University
Inés Olza Moreno, Institute for Culture and Society, University of Navarra
Cristóbal Pagán Cánovas, Institute for Culture and Society, University of Navarra
Francis Steen, University of California-Los Angeles
Mark Turner, Case Western Reserve University
Javier Valenzuela, University of Murcia
Soumya Ray, Case Western Reserve University

DOI:

https://doi.org/10.1609/aaai.v32i1.11421

Keywords:

multimodal communication, gesture recognition, computer vision

Abstract

Human communication is multimodal: it includes elements such as gesture and facial expression along with spoken language. Modern technology makes it feasible to capture all such aspects of communication in natural settings. As a result, like their counterparts in fields such as genetics, astronomy and neuroscience, scholars in areas such as linguistics and communication studies are on the verge of a data-driven revolution. These new approaches require analytical support from machine learning and artificial intelligence, in the form of tools that help process the vast data repositories. The Distributed Little Red Hen Lab project is an international team of interdisciplinary researchers building a large-scale infrastructure for data-driven multimodal communication research. In this paper, we describe a machine learning system developed as part of this project to automatically annotate a large database of television program videos. The annotations mark regions where people or speakers are on screen, along with body part motions including head, hand and shoulder motion. We also annotate a specific class of gestures known as timeline gestures. An existing gesture annotation tool, ELAN, can be used with these annotations to quickly locate gestures of interest. Finally, we provide an update mechanism for the system based on human feedback. We empirically evaluate the accuracy of the system and present data from pilot human studies to show its effectiveness at aiding gesture scholars in their work.
