A Stratified Learning Approach for Predicting the Popularity of Twitter Idioms
Keywords: Idioms, popularity prediction, stratified learning
Twitter Idioms are an important class of hashtags that spread on Twitter. In this paper, we propose a classifier that separates Idioms from other kinds of hashtags with 86.93% accuracy and high precision and recall. We then learn regression models on the two strata (Idioms and non-Idioms) separately to predict the popularity of the Idioms. Stratification by itself yields more accurate predictions, and it also makes it possible to include Idiom-specific features that further improve accuracy for the Idioms. Experimental results show that stratification during the training phase, followed by the inclusion of Idiom-specific features, leads to overall improvements of 11.13% and 19.56% in the correlation coefficient over the baseline method after the 7th and the 11th month, respectively.
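As a rough illustration of the stratified learning idea (classify first, then fit one regression model per stratum), consider the minimal Python sketch below. It is not the implementation used in this paper. The length-based `is_idiom` rule is a hypothetical stand-in for the learned classifier, the single "early count" feature stands in for the full feature set (Idiom-specific features are omitted), and the hashtags and counts in `TOY_SAMPLES` are invented for illustration only.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return slope, my - slope * mx


def is_idiom(tag):
    # Hypothetical stand-in for the learned Idiom classifier:
    # treat long, phrase-like hashtags as Idioms.
    return len(tag) >= 12


def train_stratified(samples):
    """samples: iterable of (hashtag, early_count, later_count).

    Splits the samples into two strata with the classifier and fits
    a separate regression model on each stratum.
    """
    strata = {True: [], False: []}
    for tag, early, later in samples:
        strata[is_idiom(tag)].append((early, later))
    return {
        label: fit_line([e for e, _ in rows], [l for _, l in rows])
        for label, rows in strata.items()
    }


def predict(models, tag, early):
    """Route the hashtag to its stratum's model and predict popularity."""
    slope, intercept = models[is_idiom(tag)]
    return slope * early + intercept


# Invented toy data: the two strata follow different growth trends,
# which a single pooled regression line could not capture.
TOY_SAMPLES = [
    ("#10thingsihate", 10, 31),
    ("#whyimsingle", 20, 61),
    ("#ifyouknowmewell", 30, 91),
    ("#nba", 10, 7),
    ("#oscars", 20, 12),
    ("#iphone", 40, 22),
]

models = train_stratified(TOY_SAMPLES)
```

Calling `predict(models, tag, early)` applies the Idiom model or the non-Idiom model depending on the classifier's decision for `tag`. In this sketch the same classifier is used at training and prediction time, so its errors propagate directly into the regression step.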