Exploring Social Context for Topic Identification in Short and Noisy Texts


  • Xin Wang Jilin University;Key Laboratory of Symbolic Computation and Knowledge Engineering, Ministry of Education; Changchun Institute of Tech
  • Ying Wang Jilin University
  • Wanli Zuo Jilin University
  • Guoyong Cai Guilin University of Electronic Technology




Topic Identification, Short and Noisy Texts, Preference Consistency, Social Contagion, Lasso


With the spread of social media, topic identification in short texts has attracted increasing attention in recent years. However, social media texts are by nature short and noisy, and their structures are sparse and dynamic, which makes it difficult to identify topic categories accurately in online social media. Inspired by social science findings that preference consistency and social contagion are observed in social media, we investigate topic identification in short and noisy texts by exploring social context from the perspective of the social sciences. In particular, we present a mathematical optimization formulation that incorporates the preference consistency and social contagion theories into a supervised learning method, and we conduct feature selection to handle the short and noisy texts found in social media. The result is a Sociological framework for Topic Identification (STI). Experimental results on real-world datasets from Twitter and Citation Network demonstrate the effectiveness of the proposed framework. Further experiments are conducted to understand the importance of social context in topic identification.
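The abstract does not state the STI objective itself, so the following is only a minimal sketch of the general idea it describes: a supervised least-squares classifier with a Lasso (L1) penalty for feature selection, plus a graph-smoothness term that encourages socially linked texts to receive similar topic predictions. The objective, the function names (`fit_social_lasso`, `objective`, `soft_threshold`), the adjacency matrix `A`, and the weights `alpha` and `beta` are all assumptions made for illustration, not the paper's formulation. The solver is a plain proximal gradient (ISTA) loop.

```python
import numpy as np

def soft_threshold(W, t):
    # Proximal operator of t * ||W||_1 (elementwise shrinkage toward zero).
    return np.sign(W) * np.maximum(np.abs(W) - t, 0.0)

def objective(X, Y, A, W, alpha, beta):
    # Assumed objective (illustrative, not taken from the paper):
    #   0.5*||XW - Y||_F^2 + 0.5*beta*tr((XW)^T L (XW)) + alpha*||W||_1
    L = np.diag(A.sum(axis=1)) - A          # graph Laplacian of the social links
    P = X @ W                               # predicted topic scores per text
    fit = 0.5 * np.sum((P - Y) ** 2)
    smooth = 0.5 * beta * np.trace(P.T @ L @ P)
    return fit + smooth + alpha * np.abs(W).sum()

def fit_social_lasso(X, Y, A, alpha=0.1, beta=0.1, n_iter=500):
    # X: (n, d) text features, Y: (n, c) one-hot topic labels,
    # A: (n, n) symmetric, non-negative adjacency between texts
    #    (hypothetical stand-in for social context such as same-author
    #    or follower links).
    n, d = X.shape
    L = np.diag(A.sum(axis=1)) - A
    M = X.T @ (np.eye(n) + beta * L) @ X    # Hessian of the smooth part
    XtY = X.T @ Y
    # Step size 1 / (largest eigenvalue of M) guarantees monotone descent.
    step = 1.0 / max(np.linalg.eigvalsh(M)[-1], 1e-12)
    W = np.zeros((d, Y.shape[1]))
    for _ in range(n_iter):
        grad = M @ W - XtY                  # gradient of the smooth part
        W = soft_threshold(W - step * grad, step * alpha)
    return W
```

In this sketch, a larger `alpha` zeroes out more feature weights (the feature-selection effect of Lasso), while a larger `beta` pulls the predictions of linked texts closer together; rows of `W` that are entirely zero correspond to features that have been discarded.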




How to Cite

Wang, X., Wang, Y., Zuo, W., & Cai, G. (2015). Exploring Social Context for Topic Identification in Short and Noisy Texts. Proceedings of the AAAI Conference on Artificial Intelligence, 29(1). https://doi.org/10.1609/aaai.v29i1.9463



Main Track: Machine Learning Applications