We Don't Speak the Same Language: Interpreting Polarization through Machine Translation

Authors

  • Ashiqur R. KhudaBukhsh, Carnegie Mellon University
  • Rupak Sarkar, Maulana Abul Kalam University of Technology
  • Mark S. Kamlet, Carnegie Mellon University
  • Tom Mitchell, Carnegie Mellon University

Keywords:

Computational Social Science, Web

Abstract

Polarization among US political parties, media and elites is a widely studied topic. Prominent lines of prior research across multiple disciplines have observed and analyzed growing polarization in social media. In this paper, we present a new methodology that offers a fresh perspective on interpreting polarization through the lens of machine translation. With a novel proposition that two sub-communities are speaking in two different "languages", we demonstrate that modern machine translation methods can provide a simple yet powerful and interpretable framework to understand the differences between two (or more) large-scale social media discussion data sets at the granularity of words. Via a substantial corpus of 86.6 million comments by 6.5 million users on over 200,000 news videos hosted by YouTube channels of four prominent US news networks, we demonstrate that simple word-level and phrase-level translation pairs can reveal deep insights into the current political divide -- what is "black lives matter" to one can be "all lives matter" to the other.
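
The abstract does not say which translation method is used, so the following is only a minimal sketch of one common way to obtain word-level translation pairs between two communities. It assumes that each community has its own word embeddings, that a small set of shared anchor words (here, stop words) is used in the same way by both, and that an orthogonal map learned on those anchors aligns the two spaces. The helper names (`learn_rotation`, `translate`), the tokens `word_a` and `word_b`, and the synthetic vectors are all hypothetical and are not taken from the paper.

```python
import numpy as np


def learn_rotation(src_anchor, tgt_anchor):
    """Orthogonal Procrustes: the orthogonal W minimizing ||src_anchor @ W - tgt_anchor||_F."""
    u, _, vt = np.linalg.svd(src_anchor.T @ tgt_anchor)
    return u @ vt


def translate(word, src_vecs, tgt_vecs, rotation):
    """Map a source-community word into the target space and return the nearest target word (cosine)."""
    query = src_vecs[word] @ rotation
    query = query / np.linalg.norm(query)
    best_word, best_score = None, -np.inf
    for candidate, vec in tgt_vecs.items():
        score = float(query @ (vec / np.linalg.norm(vec)))
        if score > best_score:
            best_word, best_score = candidate, score
    return best_word


# Synthetic stand-in data (assumption): real embeddings would be trained per community.
rng = np.random.default_rng(0)
anchors = ["the", "and", "of", "to", "in", "is"]  # assumed to be used identically by both sides
content = ["word_a", "word_b"]                    # hypothetical content words
dim = 5

tgt_vecs = {w: rng.normal(size=dim) for w in anchors + content}

# The source space is a rotated copy of the target space ...
q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
src_vecs = {w: tgt_vecs[w] @ q for w in anchors}
# ... except that the two content words trade places, i.e. one community uses
# "word_a" in the contexts where the other community uses "word_b".
src_vecs["word_a"] = tgt_vecs["word_b"] @ q
src_vecs["word_b"] = tgt_vecs["word_a"] @ q

rotation = learn_rotation(
    np.stack([src_vecs[w] for w in anchors]),
    np.stack([tgt_vecs[w] for w in anchors]),
)

print(translate("the", src_vecs, tgt_vecs, rotation))
print(translate("word_a", src_vecs, tgt_vecs, rotation))
```

In this toy setup an anchor word translates to itself, while `word_a` translates to `word_b`. A word whose translation differs from itself is the kind of mismatched pair the abstract refers to, although real embeddings are noisy and would need frequency filtering and manual inspection.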

Published

2021-05-18

How to Cite

KhudaBukhsh, A. R., Sarkar, R., Kamlet, M. S., & Mitchell, T. (2021). We Don’t Speak the Same Language: Interpreting Polarization through Machine Translation. Proceedings of the AAAI Conference on Artificial Intelligence, 35(17), 14893-14901. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/17748

Section

AAAI Special Track on AI for Social Impact