Classifying Political Orientation on Twitter: It’s Not Easy!

Raviv Cohen; Derek Ruths

doi:10.1609/icwsm.v7i1.14434

Authors

Raviv Cohen, McGill University
Derek Ruths, McGill University

DOI:

https://doi.org/10.1609/icwsm.v7i1.14434

Keywords:

latent attribute inference, Twitter, political orientation

Abstract

Numerous papers have reported great success at inferring the political orientation of Twitter users. This paper has some unfortunate news to deliver: while past work has been sound and often methodologically novel, we have discovered that reported accuracies have been systematically overoptimistic because of the way in which validation datasets were collected, with reported accuracy levels nearly 30% higher than can be expected in populations of general Twitter users. Using careful and novel data collection and annotation techniques, we collected three different sets of Twitter users, each characterizing a different degree of political engagement on Twitter, from politicians (highly politically vocal) to "normal" users (those who rarely discuss politics). Applying standard techniques for inferring political orientation, we show that methods that previously reported greater than 90% inference accuracy actually achieve barely 65% accuracy on normal users. We also show that classifiers cannot be used to classify users outside the narrow range of political orientation on which they were trained. While sobering, our results quantify and call attention to overlooked problems in the latent attribute inference literature that, no doubt, extend beyond political orientation inference: the way in which datasets are assembled and the transferability of classifiers.
