CrisisLex: A Lexicon for Collecting and Filtering Microblogged Communications in Crises
DOI:
https://doi.org/10.1609/icwsm.v8i1.14538
Keywords:
Crisis Informatics, Microblogging, Lexicon Construction, Pseudo-Relevance Feedback, Adaptive Information Filtering
Abstract
Locating timely, useful information during crises and mass emergencies is critical for those forced to make potentially life-altering decisions. As the use of Twitter to broadcast useful information during such situations becomes more widespread, the problem of finding it becomes more difficult. We describe an approach toward improving the recall in the sampling of Twitter communications that can lead to greater situational awareness during crisis situations. First, we create a lexicon of crisis-related terms that frequently appear in relevant messages posted during different types of crisis situations. Next, we demonstrate how we use the lexicon to automatically identify new terms that describe a given crisis. Finally, we explain how to efficiently query Twitter to extract crisis-related messages during emergency events. In our experiments, using a crisis lexicon leads to substantial improvements in terms of recall when added to a set of crisis-specific keywords manually chosen by experts; it also helps to preserve the original distribution of message types.
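The second step above, using the lexicon to surface new terms for a given crisis, can be illustrated with a minimal pseudo-relevance feedback sketch. This is not the authors' implementation: the function names (`tokenize`, `expand_lexicon`), the single-token lexicon, the frequency-ratio score, and the `top_k` and `min_count` parameters are all assumptions made for illustration. The idea is to treat messages that match the lexicon as pseudo-relevant and then promote terms that are common in those messages but rare in the rest.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a message and split it into simple word-like tokens."""
    return re.findall(r"[a-z0-9#@']+", text.lower())


def expand_lexicon(messages, lexicon, top_k=3, min_count=2):
    """Suggest new terms via pseudo-relevance feedback (illustrative only).

    Messages containing at least one lexicon term are assumed relevant.
    Candidate terms are ranked by how much more often they occur in the
    pseudo-relevant messages than in the remaining ones.
    """
    lexicon = set(lexicon)
    relevant, other = Counter(), Counter()
    n_relevant = n_other = 0

    for message in messages:
        tokens = set(tokenize(message))  # count each term once per message
        if tokens & lexicon:
            relevant.update(tokens)
            n_relevant += 1
        else:
            other.update(tokens)
            n_other += 1

    scores = {}
    for term, count in relevant.items():
        if term in lexicon or count < min_count:
            continue  # skip known terms and terms seen too rarely
        p_relevant = count / n_relevant
        p_other = (other[term] + 1) / (n_other + 1)  # add-one smoothing
        scores[term] = p_relevant / p_other

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]
```

In a real collection pipeline the suggested terms would be added to the query keywords and the process repeated as new messages arrive; how the candidates are scored and filtered in the paper itself is described in the full text, not in this sketch.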