Using Social Media to Infer Gender Composition of Commuter Populations
DOI:
https://doi.org/10.1609/icwsm.v6i5.14223
Keywords:
Twitter, demographic inference, urban populations
Abstract
For a municipality to serve and engage its constituency effectively, it must understand the composition of the communities within it. To date, such demographic estimates for target populations have been obtained largely from census data or from expensive, time-intensive surveys. In this paper, we use Twitter microblog content to estimate the gender makeup of commuting populations that use different modes of transportation (cars, public transportation, and bikes) in Toronto, Canada. We apply a demographic inference algorithm to 33,215 public Twitter accounts, each of which follows one of three popular transportation-related Twitter news feeds (one for traffic, one for public transportation updates, and one for bicycling). Recent census data provide the ground truth against which we compare the estimates derived from Twitter. We find that, for all three communities (car drivers, public transport users, and bicyclists), the estimates obtained from Twitter reflect the majority-minority relationships between genders reported in the census data. This provides preliminary but compelling evidence that Twitter may be a platform that can go beyond simply signaling the presence of physical communities to actually measuring their composition.