Predicting Perceived Music Emotions with Respect to Instrument Combinations


  • Viet Dung Nguyen, Rochester Institute of Technology
  • Quan H. Nguyen, Gettysburg College
  • Richard G. Freedman, SIFT



Music Emotion Recognition, Music Information Retrieval, Random Forest, Convolution Neural Network, Long Short-term Memory


Music Emotion Recognition has attracted considerable research attention in recent years because of its wide range of applications, including song recommendation and music visualization. Because music is a way for humans to express emotion, there is a need for machines to automatically infer the perceived emotion of a piece of music. In this paper, we compare the accuracy of music emotion recognition models given music pieces as a whole versus music pieces separated by instrument. To compare the models' emotion predictions, which are distributions over valence and arousal values, we provide a metric that compares two distribution curves. Using this metric, we provide empirical evidence that training Random Forest and Convolutional Recurrent Neural Network models on mixed instrumental music data conveys a better understanding of emotion than training the same models on music that has been separated into its individual instrumental sources.




How to Cite

Nguyen, V. D., Nguyen, Q. H., & Freedman, R. G. (2024). Predicting Perceived Music Emotions with Respect to Instrument Combinations. Proceedings of the AAAI Conference on Artificial Intelligence, 37(13), 16078-16086.



EAAI Symposium: Human-Aware AI in Sound and Music