CrisisMMD: Multimodal Twitter Datasets from Natural Disasters
DOI: https://doi.org/10.1609/icwsm.v12i1.14983

Keywords: Multimodal, Twitter datasets, Textual and multimedia content, Natural disasters

Abstract
During natural and man-made disasters, people use social media platforms such as Twitter to post textual and multimedia content reporting updates about injured or dead people, infrastructure damage, and missing or found people, among other information types. Studies have revealed that this online information, if processed in a timely and effective manner, is extremely useful for humanitarian organizations to gain situational awareness and plan relief operations. Beyond the analysis of textual content, recent studies have shown that imagery content on social media can significantly boost disaster response. Despite extensive research focusing mainly on textual content to extract useful information, limited work has examined the use of imagery content or the combination of both content types, in part because of the lack of labeled imagery data in this domain. In this paper, we aim to address this limitation by releasing a large multimodal dataset collected from Twitter during natural disasters. We provide three types of annotations, which are useful for addressing a number of crisis response and management tasks faced by different humanitarian organizations.
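For readers who plan to work with the released annotations, the sketch below shows one plausible way to load and filter such a multimodal annotation file in Python. The file name, column names, and label values are illustrative assumptions for this example, not the dataset's documented schema.

```python
# Minimal sketch: pairing tweet text with attached images via an
# annotation TSV. Paths, columns, and label values below are
# hypothetical placeholders, not CrisisMMD's actual schema.
import pandas as pd

# Assumed layout: one row per (tweet, image) pair, with separate
# human labels for the text and the image modality.
df = pd.read_csv("annotations/hurricane_harvey.tsv", sep="\t")

# Keep only pairs where both modalities were labeled informative
# for disaster response (hypothetical label value).
informative = df[
    (df["text_label"] == "informative")
    & (df["image_label"] == "informative")
]

for _, row in informative.iterrows():
    print(row["tweet_id"], row["image_path"])
```

A per-modality filter like this reflects the multimodal nature of the annotations: a tweet's text and its image can carry different labels, so downstream tasks may use either modality alone or require agreement between the two.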