Predicting Emotion Perception Across Domains: A Study of Singing and Speaking

Biqiao Zhang; Emily Mower Provost; Robert Swedberg; Georg Essl

doi:10.1609/aaai.v29i1.9334

Authors

Biqiao Zhang, University of Michigan
Emily Mower Provost, University of Michigan
Robert Swedberg, University of Michigan
Georg Essl, University of Michigan

Keywords:

Emotion Perception, Emotion Modeling, Singing, Speaking

Abstract

Emotion affects our understanding of the opinions and sentiments of others. Research has demonstrated that humans are able to recognize emotions in various domains, including speech and music, and that there are potential shared features that shape the emotion in both domains. In this paper, we investigate acoustic and visual features that are relevant to emotion perception in the domains of singing and speaking. We train regression models using two paradigms: (1) within-domain, in which models are trained and tested on the same domain, and (2) cross-domain, in which models are trained on one domain and tested on the other. This strategy allows us to analyze the similarities and differences underlying the relationship between audio-visual feature expression and emotion perception, and how this relationship is affected by the domain of expression. We use kernel density estimation to model emotion as a probability distribution over the perceptions reported by multiple evaluators on the valence-activation space. This allows us to model the variation inherent in the reported perception. Results suggest that activation can be modeled more accurately across domains than valence. Furthermore, visual features capture cross-domain emotion more accurately than acoustic features. The results provide additional evidence for a shared mechanism underlying spoken and sung emotion perception.
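To make the kernel density estimation step concrete, the following is a minimal sketch of how several evaluators' ratings of one utterance could be turned into a probability distribution over a valence-activation grid. It is an illustration only, not the authors' implementation: the function name `kde_2d`, the 1-to-5 rating scale, the isotropic Gaussian kernel, the bandwidth of 0.5, and the example ratings are all assumptions that do not come from the abstract.

```python
import math

def kde_2d(ratings, grid, bandwidth=0.5):
    """Evaluate an isotropic Gaussian KDE at each grid point.

    ratings:   list of (valence, activation) pairs, one per evaluator
    grid:      list of (valence, activation) points to evaluate at
    bandwidth: kernel width (hypothetical value, not from the paper)

    Returns densities normalized to sum to 1 over the grid, so the
    result can be read as a discrete probability distribution.
    """
    densities = []
    for gv, ga in grid:
        total = 0.0
        for v, a in ratings:
            sq_dist = (gv - v) ** 2 + (ga - a) ** 2
            total += math.exp(-sq_dist / (2 * bandwidth ** 2))
        densities.append(total)
    norm = sum(densities)
    return [d / norm for d in densities]

# Hypothetical ratings from five evaluators on an assumed 1-5 scale.
ratings = [(2.0, 4.0), (2.0, 5.0), (3.0, 4.0), (2.0, 4.0), (1.0, 3.0)]

# 5 x 5 grid covering the assumed valence-activation space.
grid = [(v, a) for v in range(1, 6) for a in range(1, 6)]

dist = kde_2d(ratings, grid)

# Grid point with the highest estimated probability.
peak = grid[max(range(len(grid)), key=dist.__getitem__)]
```

Under these assumptions, the resulting distribution keeps the disagreement among evaluators (mass spread over neighbouring grid cells) instead of collapsing the ratings to a single mean, which is the kind of variation the abstract says the approach is meant to capture.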
