Textual Analysis and Timely Detection of Suspended Social Media Accounts

Dominic Seyler; Shulong Tan; Dingcheng Li; Jingyuan Zhang; Ping Li

doi:10.1609/icwsm.v15i1.18091

Authors

Dominic Seyler, Baidu Research
Shulong Tan, Baidu Research
Dingcheng Li, Baidu Research
Jingyuan Zhang, Baidu Research
Ping Li, Baidu Research

Keywords:

Subjectivity in textual data; sentiment analysis; polarity/opinion identification and extraction, linguistic analyses of social media behavior, Text categorization; topic recognition; demographic/gender/age identification, Qualitative and quantitative studies of social media, Credibility of online content

Abstract

Suspended accounts are high-risk accounts that violate the rules of a social network. These accounts contain spam and offensive or explicit language, among other content, and are highly variable in terms of textual content. In this work, we perform a detailed linguistic and statistical analysis of the textual information of suspended accounts and show how insights from our study significantly improve a deep-learning-based detection framework. Moreover, we investigate the utility of advanced topic modeling for the automatic creation of word lists that can discriminate suspended from regular accounts. Since early detection of these high-risk accounts is crucial, we evaluate multiple state-of-the-art classification models along the temporal dimension by measuring the minimum amount of textual signal needed to perform reliable predictions. Further, we show that the best-performing models are able to detect suspended accounts earlier than the social media platform.
