Improving Microblog Retrieval from Exterior Corpus by Automatically Constructing Microblogging Corpus

Authors

  • Wenting Tu, The University of Hong Kong
  • David Cheung, The University of Hong Kong
  • Nikos Mamoulis, The University of Hong Kong

DOI:

https://doi.org/10.1609/aaai.v29i1.9716

Keywords:

text classification, microblogging platform

Abstract

A large-scale training corpus consisting of microblogs belonging to a desired category is important for high-accuracy microblog retrieval. Obtaining such a large-scale microblogging corpus manually is very time-consuming and labor-intensive. Therefore, some models for the automatic retrieval of microblogs from an exterior corpus have been proposed. However, these approaches may fail to consider microblog-specific features. To alleviate this issue, we propose a methodology that constructs a simulated microblogging corpus rather than directly building a model from the exterior corpus. Our model performs better because the retrieval model ultimately uses the microblog-specific knowledge contained in the microblogging corpus. Experimental results on real-world microblogs demonstrate the superiority of our technique over previous approaches.

Published

2015-03-04

How to Cite

Tu, W., Cheung, D., & Mamoulis, N. (2015). Improving Microblog Retrieval from Exterior Corpus by Automatically Constructing Microblogging Corpus. Proceedings of the AAAI Conference on Artificial Intelligence, 29(1). https://doi.org/10.1609/aaai.v29i1.9716