Detecting Divisive Language: A Concept-Grounded, LLM-Guided Pipeline for Polarizing Social Media Sphere
DOI:
https://doi.org/10.1609/icwsm.v20i1.42796
Abstract
Political polarization poses a growing global challenge, yet existing NLP approaches typically rely on indirect proxies such as toxicity or sentiment, which fail to capture the identity-based antagonism that is central to polarizing discourse. We address this gap by conceptualizing polarization-related discourse as divisive language: language that explains political or social disagreement by attributing it to group-based identities. Building on this definition, we propose a staged training pipeline that uses large language models (LLMs) to generate definition-grounded supervision and progressively distills it into lightweight classifiers suitable for large-scale analysis. Experiments on social media data show that the resulting models substantially outperform zero-shot prompting and small-scale supervised baselines, while detecting forms of polarization that are not captured by toxicity- or sentiment-based methods. Our findings demonstrate that divisive language can be treated as a distinct, computable linguistic construct, enabling scalable and theoretically grounded analysis of political polarization.
Published
2026-05-25
How to Cite
He, Y., Li, T., Wang, J., & Zhang, Y. (2026). Detecting Divisive Language: A Concept-Grounded, LLM-Guided Pipeline for Polarizing Social Media Sphere. Proceedings of the International AAAI Conference on Web and Social Media, 20(1), 2980–2986. https://doi.org/10.1609/icwsm.v20i1.42796
Section
Poster Papers