Segmentation of Tweets with URLs and its Applications to Sentiment Analysis

Authors

  • Abdullah Aljebreen, Temple University
  • Weiyi Meng, Binghamton University
  • Eduard Dragut, Temple University

Keywords

Text Classification & Sentiment Analysis, Syntax -- Tagging, Chunking & Parsing, Information Extraction

Abstract

An important means of disseminating information on social media platforms is including URLs that point to external sources in user posts. On Twitter, we estimate that about 21% of the daily stream of English-language tweets contain URLs. We observe that NLP tools make little attempt to understand the relationship between the content of the URL and the text surrounding it in a tweet. In this work, we study the structure of tweets with URLs relative to the content of the Web documents pointed to by the URLs. We identify several segment classes that may appear in a tweet with URLs, such as the title of a Web page and the user's original content. Our goals in this paper are to introduce, define, and analyze the segmentation problem of tweets with URLs; to develop an effective algorithm to solve it; and to show that our solution can benefit sentiment analysis on Twitter. We also show that the problem is an instance of the block edit distance problem and is thus NP-hard.
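
To make the segmentation idea concrete, the following is a minimal sketch and not the algorithm developed in the paper. It assumes the title of the linked Web page is already available as a string, and it uses a simple longest-common-block heuristic from Python's standard `difflib` module in place of the block edit distance formulation. The function name `segment_tweet`, the labels `"title"`, `"user"`, and `"url"`, and the `min_title_tokens` threshold are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher
from itertools import groupby


def segment_tweet(tweet_text, page_title, min_title_tokens=2):
    """Split a tweet into labeled segments: 'title', 'user', or 'url'.

    Illustrative heuristic only: the longest run of tokens shared with the
    page title (case-insensitive) is labeled 'title', URL tokens are labeled
    'url', and everything else is treated as the user's own content.
    """
    tokens = tweet_text.split()
    lowered = [t.lower() for t in tokens]
    title = [t.lower() for t in page_title.split()]

    # Longest contiguous block of tokens common to the tweet and the title.
    matcher = SequenceMatcher(None, lowered, title, autojunk=False)
    match = matcher.find_longest_match(0, len(lowered), 0, len(title))
    has_title = match.size >= min_title_tokens

    labels = []
    for i, tok in enumerate(tokens):
        if tok.startswith(("http://", "https://")):
            labels.append("url")
        elif has_title and match.a <= i < match.a + match.size:
            labels.append("title")
        else:
            labels.append("user")

    # Merge consecutive tokens that share a label into one segment.
    segments = []
    for label, group in groupby(zip(labels, tokens), key=lambda p: p[0]):
        segments.append((label, " ".join(tok for _, tok in group)))
    return segments


if __name__ == "__main__":
    # Hypothetical tweet and URL, used only to exercise the sketch.
    example = segment_tweet(
        "Great read! Segmentation of Tweets with URLs https://example.com/paper",
        "Segmentation of Tweets with URLs and its Applications to Sentiment Analysis",
    )
    for label, text in example:
        print(f"{label}: {text}")
```

A single contiguous match cannot handle titles that a user truncates, reorders, or interleaves with their own words, which is the kind of case the block edit distance view is meant to capture. In a sentiment pipeline, one plausible use of such labels would be to score the `"user"` segments separately from the copied title text.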

Published

2021-05-18

How to Cite

Aljebreen, A., Meng, W., & Dragut, E. (2021). Segmentation of Tweets with URLs and its Applications to Sentiment Analysis. Proceedings of the AAAI Conference on Artificial Intelligence, 35(14), 12480-12488. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/17480

Section

AAAI Technical Track on Speech and Natural Language Processing I