Language Matters In Twitter: A Large Scale Study
Despite the widespread adoption of Twitter internationally, little research has investigated the differences among users of different languages. In prior research, the natural tendency has been to assume that the behaviors of English users generalize to users of other languages. We studied 62 million tweets collected over a four-week period and found that more than 100 languages were used. Only about half of the tweets (51%) were in English. Other popular languages, including Japanese, Portuguese, Indonesian, and Spanish, together accounted for 39% of the tweets. Examining users of the top 10 languages, we discovered cross-language differences in the adoption of features such as URLs, hashtags, mentions, replies, and retweets. We discuss our work’s implications for research on large-scale social systems and for the design of cross-cultural communication tools.
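
As an illustration of the kind of measurement summarized above, the following is a minimal sketch of how per-language tweet share and feature adoption rates might be computed. It is not the method used in the study. The input format (pairs of language label and tweet text), the function name `feature_rates`, and the regular expressions are all assumptions made for this example; in particular, treating a tweet that begins with `@name` as a reply and one containing `RT @name` as a retweet is a simplification.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, simplified patterns chosen for illustration only;
# they are not the feature definitions used in the study.
FEATURES = {
    "url": re.compile(r"https?://\S+"),
    "hashtag": re.compile(r"#\w+"),
    "mention": re.compile(r"@\w+"),
    "reply": re.compile(r"^@\w+"),       # assumption: reply = starts with @name
    "retweet": re.compile(r"\bRT @\w+"),  # assumption: retweet = contains "RT @name"
}


def feature_rates(tweets):
    """Compute language shares and per-language feature adoption rates.

    tweets: iterable of (lang, text) pairs, where lang is an already
    assigned language label (language detection is out of scope here).
    Returns (share, rates): share maps lang to its fraction of all tweets,
    rates maps lang to {feature: fraction of that language's tweets}.
    """
    totals = Counter()
    hits = defaultdict(Counter)
    for lang, text in tweets:
        totals[lang] += 1
        for name, pattern in FEATURES.items():
            if pattern.search(text):
                hits[lang][name] += 1
    n = sum(totals.values())
    share = {lang: count / n for lang, count in totals.items()}
    rates = {
        lang: {name: hits[lang][name] / totals[lang] for name in FEATURES}
        for lang in totals
    }
    return share, rates


# Tiny made-up sample, used only to exercise the function.
sample = [
    ("en", "RT @alice: new post http://example.com #news"),
    ("en", "good morning"),
    ("ja", "@bob こんにちは"),
    ("pt", "bom dia #brasil"),
]
share, rates = feature_rates(sample)
```

Counting each feature at most once per tweet yields the fraction of tweets that use a feature, which is the quantity that can be compared across languages regardless of how many tweets each language contributes.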