CAO: A Fully Automatic Emoticon Analysis System

Authors

  • Michal Ptaszynski, Hokkaido University
  • Jacek Maciejewski, Hokkaido University
  • Pawel Dybala, Hokkaido University
  • Rafal Rzepka, Hokkaido University
  • Kenji Araki, Hokkaido University

DOI:

https://doi.org/10.1609/aaai.v24i1.7715

Keywords:

Natural-Language Processing, Text Classification, Information Extraction, Human-Computer Interaction, Intelligent User Interfaces, Affect analysis, Emoticon, Facemark

Abstract

This paper presents CAO, a system for affect analysis of emoticons. Emoticons are strings of symbols widely used in text-based online communication to convey emotions. CAO extracts emoticons from input and determines the specific emotions they express. It first matches the extracted emoticons against a raw emoticon database containing over ten thousand emoticon samples extracted from the Web and annotated automatically. Emoticons whose emotion types cannot be determined from this database alone are automatically divided into semantic areas representing "mouths" or "eyes," based on the theory of kinesics. The areas are automatically annotated according to their co-occurrence in the database. Annotation is first attempted on the eye-mouth-eye triplet; if no such triplet is found, all semantic areas are estimated separately. This gives the system coverage exceeding 3 million possibilities. The evaluation, performed on both training and test sets, confirmed the system's capability to adequately detect and extract any emoticon, analyze its semantic structure, and estimate the potential emotion types expressed. The system achieved nearly ideal scores, outperforming existing emoticon analysis systems.
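
The three-stage fallback described in the abstract (raw database match, then eye-mouth-eye triplet, then separate semantic areas) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the tables, emotion labels, counts, the function names `split_triplet` and `estimate_emotions`, and the naive splitting rule are all hypothetical, chosen only to show the order in which the lookups are tried.

```python
from collections import Counter

# Hypothetical toy data. The real CAO database holds over ten thousand
# Web-harvested emoticons and is not reproduced here.
RAW_DB = {"(^_^)": ["joy"], "(T_T)": ["sorrow"]}
TRIPLET_DB = {("^", "o", "^"): ["joy", "surprise"]}
EYE_DB = {
    "^": Counter(joy=5, fondness=2),
    "T": Counter(sorrow=6),
    ">": Counter(anger=3),
    "<": Counter(anger=3),
}
MOUTH_DB = {
    "_": Counter(joy=3, sorrow=3),
    "o": Counter(surprise=4),
    "x": Counter(dislike=2),
}


def split_triplet(emoticon):
    """Naively split a bracketed three-symbol face into (eye, mouth, eye).

    Assumption: one character per semantic area. Real emoticons need a
    far more careful segmentation than this.
    """
    core = emoticon.strip("()")
    if len(core) != 3:
        return None
    return core[0], core[1], core[2]


def estimate_emotions(emoticon):
    """Return (stage, emotion labels) using a three-stage fallback."""
    # Stage 1: exact match against the raw emoticon database.
    if emoticon in RAW_DB:
        return "raw", RAW_DB[emoticon]

    triplet = split_triplet(emoticon)
    if triplet is None:
        return "none", []

    # Stage 2: look up the whole eye-mouth-eye triplet.
    if triplet in TRIPLET_DB:
        return "triplet", TRIPLET_DB[triplet]

    # Stage 3: estimate each semantic area separately and pool the counts.
    left_eye, mouth, right_eye = triplet
    scores = Counter()
    for table, symbol in ((EYE_DB, left_eye), (MOUTH_DB, mouth), (EYE_DB, right_eye)):
        scores.update(table.get(symbol, Counter()))
    ranked = [label for label, _ in scores.most_common()]
    return "areas", ranked
```

With this toy data, `estimate_emotions("(^_^)")` is resolved by the raw database, `estimate_emotions("(^o^)")` by the triplet table, and `estimate_emotions("(>x<)")` falls through to the separate eye and mouth tables. Pooling raw counts in the last stage is only one simple way to combine the areas; the abstract does not specify how CAO combines them.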

Published

2010-07-04

How to Cite

Ptaszynski, M., Maciejewski, J., Dybala, P., Rzepka, R., & Araki, K. (2010). CAO: A Fully Automatic Emoticon Analysis System. Proceedings of the AAAI Conference on Artificial Intelligence, 24(1), 1026-1032. https://doi.org/10.1609/aaai.v24i1.7715

Section

AAAI Technical Track: Natural Language Processing