Homophily and Latent Attribute Inference: Inferring Latent Attributes of Twitter Users from Neighbors
In this paper, we extend existing work on latent attribute inference by leveraging the principle of homophily: we evaluate the inference accuracy gained by augmenting a user's features with features derived from the Twitter profiles and postings of her friends. We consider three attributes that have varying degrees of assortativity: gender, age, and political affiliation. Our approach yields a significant and robust increase in accuracy for both age and political affiliation, indicating that it boosts performance for attributes with moderate to high assortativity. Furthermore, different neighborhood subsets yield optimal performance for different attributes, suggesting that different subsamples of the user's neighborhood characterize different aspects of the user herself. Finally, inferences using only the features of a user's neighbors outperform those based on the user's features alone, which suggests that the neighborhood context by itself carries substantial information about the user.
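As a rough illustration of the augmentation idea, the sketch below concatenates a user's own feature vector with an aggregate of her neighbors' feature vectors. The abstract does not state how neighbor features are combined, so the element-wise mean, the concatenation, and the names `neighborhood_features` and `augment` are all assumptions made for illustration, and the numbers are toy data.

```python
from statistics import fmean


def neighborhood_features(neighbor_vectors, dim):
    """Element-wise mean of the neighbors' feature vectors.

    Assumption: the mean is used as the aggregate; the paper may use
    a different scheme. Returns zeros when there are no neighbors.
    """
    if not neighbor_vectors:
        return [0.0] * dim
    return [fmean(column) for column in zip(*neighbor_vectors)]


def augment(user_vector, neighbor_vectors):
    """Concatenate the user's own features with the neighborhood aggregate."""
    own = list(user_vector)
    return own + neighborhood_features(neighbor_vectors, len(own))


# Hypothetical toy data: three features per account.
user = [1.0, 0.0, 2.0]
friends = [[0.0, 2.0, 2.0], [2.0, 4.0, 0.0]]

augmented = augment(user, friends)          # user features + neighbor mean
neighbors_only = augmented[len(user):]      # the neighbor-only feature view
```

The augmented vector would then be passed to any standard classifier. Restricting `friends` to a particular subset of the neighborhood before aggregating corresponds to the neighborhood subsamples compared in the abstract, and `neighbors_only` corresponds to the setting in which the user's own features are withheld.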