CrisisLex: A Lexicon for Collecting and Filtering Microblogged Communications in Crises

Authors

  • Alexandra Olteanu, Ecole Polytechnique Federale de Lausanne
  • Carlos Castillo, Qatar Computing Research Institute
  • Fernando Diaz, Microsoft Research
  • Sarah Vieweg, Qatar Computing Research Institute

DOI:

https://doi.org/10.1609/icwsm.v8i1.14538

Keywords:

Crisis Informatics, Microblogging, Lexicon Construction, Pseudo-Relevance Feedback, Adaptive Information Filtering

Abstract

Locating timely, useful information during crises and mass emergencies is critical for those forced to make potentially life-altering decisions. As the use of Twitter to broadcast useful information during such situations becomes more widespread, the problem of finding it becomes more difficult. We describe an approach toward improving the recall in the sampling of Twitter communications that can lead to greater situational awareness during crisis situations. First, we create a lexicon of crisis-related terms that frequently appear in relevant messages posted during different types of crisis situations. Next, we demonstrate how we use the lexicon to automatically identify new terms that describe a given crisis. Finally, we explain how to efficiently query Twitter to extract crisis-related messages during emergency events. In our experiments, using a crisis lexicon leads to substantial improvements in terms of recall when added to a set of crisis-specific keywords manually chosen by experts; it also helps to preserve the original distribution of message types.
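
As a rough illustration of the pseudo-relevance feedback idea named in the keywords, the sketch below treats messages that match a seed term list as relevant and promotes terms that occur disproportionately often in them. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the function names (`tokenize`, `expand_terms`, `filter_messages`), the ratio-based scoring, the thresholds, and the sample messages are all hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    # Lowercase and split into simple word-like tokens (hypothetical tokenizer).
    return re.findall(r"[a-z0-9#@']+", text.lower())

def filter_messages(messages, terms):
    # Keep messages that contain at least one of the given terms.
    terms = set(terms)
    return [m for m in messages if terms & set(tokenize(m))]

def expand_terms(messages, seed_terms, top_k=3, min_count=2):
    # Pseudo-relevance feedback: messages matching the seed terms are assumed
    # relevant; candidate terms are ranked by how much more often they appear
    # in matched messages than in unmatched ones (assumed scoring, add-one smoothed).
    seed = set(seed_terms)
    hit, miss = Counter(), Counter()
    n_hit = n_miss = 0
    for message in messages:
        tokens = set(tokenize(message))
        if tokens & seed:
            hit.update(tokens - seed)
            n_hit += 1
        else:
            miss.update(tokens)
            n_miss += 1
    scores = {}
    for term, count in hit.items():
        if count < min_count:
            continue  # ignore terms seen too rarely to trust
        p_hit = count / n_hit
        p_miss = (miss[term] + 1) / (n_miss + 1)
        scores[term] = p_hit / p_miss
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]

# Invented sample messages, for illustration only.
MESSAGES = [
    "flood warning issued, evacuate riverside now",
    "flood waters rising near riverside bridge",
    "lovely coffee this morning",
    "riverside shelter open for evacuees after flood",
    "watching a movie tonight",
    "riverside road closed",
]
SEED = ["flood", "evacuate"]

new_terms = expand_terms(MESSAGES, SEED)
before = filter_messages(MESSAGES, SEED)
after = filter_messages(MESSAGES, SEED + new_terms)
print(new_terms, len(before), len(after))
```

In this toy data the expanded term list retrieves one message that the seed terms alone miss, which is the kind of recall gain the abstract refers to; real data would need stop-word handling and more careful scoring.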

Published

2014-05-16

How to Cite

Olteanu, A., Castillo, C., Diaz, F., & Vieweg, S. (2014). CrisisLex: A Lexicon for Collecting and Filtering Microblogged Communications in Crises. Proceedings of the International AAAI Conference on Web and Social Media, 8(1), 376-385. https://doi.org/10.1609/icwsm.v8i1.14538