A Semantic Interpreter for Social Media Handles

Authors

  • Azwad Anjum Islam Florida International University
  • Mark A. Finlayson Florida International University

DOI:

https://doi.org/10.1609/icwsm.v18i1.31343

Abstract

A handle is a short string of characters that identifies a user or account on a social media platform and is unique within the scope of that platform. Though usually of limited length, a handle can often be the most information-dense string in a social media user profile, potentially containing clues to the user’s name, age, location, demographics, or group affiliations. Despite this, the handle has frequently been set aside in work on inferring user information from social media profiles. We present a technique for semantic parsing of handles, which seeks to extract relevant information from the handle string. The technique is rule-based and relies on a set of tokenization rules and a variety of external databases (e.g., of names or places) to provide potential interpretations of handles in terms of names, locations, dates, indices, years, ages, positive/negative sentiments, and acronyms. We evaluate an implementation of the technique for English against existing corpora and also manually evaluate parses of randomly sampled handles. The method achieves good results in tokenizing handles (an 84.9% chance that the correct tokenization is among the top three parses, and a 97% chance that at least one of the top three parses is reasonable) and provides an overall “optimistic” interpretation performance of 90.1% accuracy and 0.89 F1. We also evaluate performance on each of the semantic aspects we interpret (name, location, index, year, age, sentiment, acronym). The technique not only allows us to extract additional information about a user from their handle but also allows us to measure trends in how handles are constructed on specific social media websites. We find that 59% of the handles in our data contain at least part of a person’s name, and over 69% of the handles are indicative of the user’s gender identity in some way. While our implementation targets English, it can be easily adapted to other languages given the appropriate databases. We release both our code and annotated evaluation data to aid other researchers in validating or extending our work.
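As a rough illustration of the rule-based idea described in the abstract (tokenize the handle, then look tokens up in external resources), the following is a minimal sketch and not the authors' released implementation. The lexicons `NAMES` and `PLACES`, the functions `tokenize`, `label`, and `interpret`, and the year range are all hypothetical stand-ins chosen for this example. The sketch also returns a single parse, whereas the paper ranks several candidate parses.

```python
import re

# Hypothetical toy lexicons; the paper relies on external databases instead.
NAMES = {"john", "maria", "smith"}
PLACES = {"miami", "texas"}


def tokenize(handle):
    """Split a handle on separators, letter/digit boundaries, and case changes."""
    tokens = []
    # First split on anything that is not a letter or digit (e.g. "_" or ".").
    for chunk in re.split(r"[^A-Za-z0-9]+", handle):
        # Then pull out all-caps runs, capitalized or lowercase words, and digit runs.
        tokens.extend(re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+", chunk))
    return tokens


def label(token, current_year=2024):
    """Assign one coarse semantic label to a token (assumed label set)."""
    lower = token.lower()
    if lower in NAMES:
        return "name"
    if lower in PLACES:
        return "location"
    if token.isdigit():
        # Assumption: a four-digit number in a plausible range is read as a year;
        # any other number is treated as an index.
        if len(token) == 4 and 1900 <= int(token) <= current_year:
            return "year"
        return "index"
    return "unknown"


def interpret(handle):
    """Return a single (token, label) parse for the handle."""
    return [(tok, label(tok)) for tok in tokenize(handle)]


if __name__ == "__main__":
    for h in ["JohnSmith_1987", "miami.maria22"]:
        print(h, interpret(h))
```

A fuller system along the lines the abstract describes would generate and score several alternative tokenizations and would also cover dates, ages, sentiment words, and acronyms, none of which this sketch attempts.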

Published

2024-05-28

How to Cite

Islam, A. A., & Finlayson, M. A. (2024). A Semantic Interpreter for Social Media Handles. Proceedings of the International AAAI Conference on Web and Social Media, 18(1), 676-690. https://doi.org/10.1609/icwsm.v18i1.31343