Using Authorship Verification to Mitigate Abuse in Online Communities

Janith Weerasinghe; Rhia Singh; Rachel Greenstadt

doi:10.1609/icwsm.v16i1.19359

Authors

Janith Weerasinghe New York University
Rhia Singh Macaulay Honors College (Hunter CUNY)
Rachel Greenstadt New York University

Keywords:

Subjectivity in textual data; sentiment analysis; polarity/opinion identification and extraction; linguistic analyses of social media behavior

Abstract

Social media has become an important channel for sharing information, which has also created opportunities for bad actors to spread disinformation and manipulate public opinion. This paper explores the possibility of applying authorship verification to online communities to mitigate abuse: by analyzing the writing style of online accounts, we aim to identify accounts managed by the same person. We expand on our similarity-based authorship verification approach, previously applied to large fanfiction documents, and show that it works in open-world settings and on shorter documents, and that it is largely topic-agnostic. Our expanded model can link Reddit accounts based on the writing style of only 40 comments with an AUC of 0.95, and the performance increases to 0.98 given more content. We apply this model to a set of suspicious Reddit accounts associated with the disinformation campaign surrounding the 2016 U.S. presidential election and show that the writing style within each of these accounts is inconsistent, indicating that each account was likely maintained by multiple individuals. We also apply this model to Reddit user accounts that commented on the WallStreetBets subreddit around the 2021 GameStop short squeeze and show that a number of account pairs share very similar writing styles. Finally, we show that this approach can link accounts across Reddit and Twitter with an AUC of 0.91, even when training data is very limited.
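
To make the idea of similarity-based account linking and its AUC evaluation concrete, the following is a minimal illustrative sketch and not the authors' model. It assumes character trigram counts as the only stylistic feature and cosine similarity as the score, both chosen here purely for illustration; the function names (char_ngrams, account_profile, cosine_similarity, auc) are hypothetical and do not come from the paper.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Count overlapping character n-grams in a lowercased text.

    Assumption: character n-grams stand in for the stylistic features
    of a real verification model.
    """
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def account_profile(comments, n=3):
    """Merge the n-gram counts of all comments written by one account."""
    profile = Counter()
    for comment in comments:
        profile.update(char_ngrams(comment, n))
    return profile


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count profiles."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for same-author pairs, 0 for different-author pairs.
    Returns the fraction of (positive, negative) pairs in which the
    positive pair has the higher score; ties count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative pair")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))
```

Under these assumptions, each account is summarized by account_profile over its comments, every pair of accounts receives a cosine_similarity score, and auc measures how well those scores separate same-author pairs from different-author pairs. An AUC of 0.95 would mean that a randomly chosen same-author pair outscores a randomly chosen different-author pair 95% of the time.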
