Hierarchical Estimation Framework of Multi-Label Classifying: A Case of Tweets Classifying into Real Life Aspects

Authors

  • Shuhei Yamamoto University of Tsukuba
  • Tetsuji Satoh University of Tsukuba

DOI:

https://doi.org/10.1609/icwsm.v9i1.14592

Abstract

Many people share their daily events and opinions on Twitter. Some of these tweets are useful and concern aspects of a user’s real life, e.g., eating, traffic conditions, and weather. Since some tweets indicate two or more aspects, multi-label classification is required. Typical methods do not perform well on tweets because tweets consist of short and elided sentences. To overcome these problems, we are researching a hierarchical estimation framework (HEF) to estimate several aspects of unknown tweets. HEF combines unsupervised and supervised machine learning. In the first phase, it extracts topics from a large collection of tweets using latent Dirichlet allocation (LDA). In the second phase, it calculates the relevance between topics and aspects using a small set of labeled tweets to build associations among them. In this paper, we introduce the entropy feedback method in the second phase. We evaluate the Shannon entropy of each association between the aspects and topics and iteratively calculate feedback coefficients from the entropy to achieve optimal associations. Experimental evaluations on a large number of actual tweets demonstrate the high efficiency of our multi-labeling method. Our entropy feedback method increased F-measures in all aspects. Especially in the Disaster and Traffic aspects, precision greatly increased without decreasing recall.
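The entropy evaluation mentioned in the abstract can be illustrated with a minimal sketch. It assumes each topic carries a vector of association weights over aspects and computes the Shannon entropy of that vector after normalization. The topic names, aspect weights, and the `shannon_entropy` helper are hypothetical and chosen only for illustration. The abstract does not give the formula for the feedback coefficients, so that step is not shown.

```python
import math


def shannon_entropy(weights):
    """Shannon entropy in bits of a non-negative weight vector, normalized to sum to 1."""
    total = sum(weights)
    if total <= 0:
        return 0.0
    probs = [w / total for w in weights if w > 0]
    return sum(-p * math.log2(p) for p in probs)


# Hypothetical topic-to-aspect association weights (not taken from the paper).
associations = {
    "topic_0": {"Eating": 8, "Traffic": 0, "Weather": 0, "Disaster": 0},
    "topic_1": {"Eating": 2, "Traffic": 2, "Weather": 2, "Disaster": 2},
}

# Low entropy: the topic points to one aspect. High entropy: the topic is ambiguous.
entropies = {
    topic: shannon_entropy(list(weights.values()))
    for topic, weights in associations.items()
}

for topic, h in entropies.items():
    print(f"{topic}: entropy = {h:.3f} bits")
```

In this sketch, a topic concentrated on a single aspect has entropy 0, while a topic spread evenly over four aspects has the maximum entropy of 2 bits. One plausible reading is that low-entropy topics are more reliable evidence for an aspect, but how the paper turns entropy into feedback coefficients is not stated in the abstract.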

Published

2021-08-03

How to Cite

Yamamoto, S., & Satoh, T. (2021). Hierarchical Estimation Framework of Multi-Label Classifying: A Case of Tweets Classifying into Real Life Aspects. Proceedings of the International AAAI Conference on Web and Social Media, 9(1), 523-532. https://doi.org/10.1609/icwsm.v9i1.14592