Enhancing Semantic Role Labeling for Tweets Using Self-Training

Authors

  • Xiaohua Liu, Harbin Institute of Technology and Microsoft Research Asia
  • Li Kuan, Chongqing University
  • Ming Zhou, Microsoft Research Asia
  • Zhongyang Xiong, Chongqing University

Abstract

Semantic Role Labeling (SRL) for tweets is a meaningful task that can benefit a wide range of applications, such as fine-grained information extraction and retrieval from tweets. One main challenge of the task is the lack of annotated tweets, which are required to train a statistical model. We introduce self-training to SRL, leveraging abundant unlabeled tweets to reduce its dependence on annotated tweets. A novel tweet selection strategy is presented, ensuring that the chosen tweets are both correct and informative. More specifically, correctness is estimated from the labeling confidences and agreement of two Conditional Random Fields based labelers, which are trained on a random, even split of the labeled data, while informativeness is proportional to the maximum distance between the tweet and the already selected tweets. We evaluate our method on a human-annotated data set and show that bootstrapping improves a baseline by 3.4% F1.
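
The selection strategy described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the two labelers are hypothetical callables standing in for the CRF-based labelers, Jaccard distance over token sets is an assumed distance measure, and the thresholds min_conf and min_info are illustrative values not taken from the paper. Informativeness follows the abstract's wording, using the maximum distance to the already selected tweets.

```python
from typing import Callable, List, Sequence, Tuple

# A labeler maps a tokenized tweet to (one label per token, confidence in [0, 1]).
# Both labelers are hypothetical stand-ins for the two CRF-based labelers.
Labeling = Tuple[List[str], float]
Labeler = Callable[[Sequence[str]], Labeling]


def jaccard_distance(a: Sequence[str], b: Sequence[str]) -> float:
    """Return 1 - |A & B| / |A | B| over token sets (an assumed distance)."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return 1.0 - len(set_a & set_b) / len(union)


def informativeness(tweet: Sequence[str], selected: Sequence[Sequence[str]]) -> float:
    """Maximum distance between the tweet and the already selected tweets."""
    if not selected:
        return 1.0  # nothing selected yet, so any tweet counts as informative
    return max(jaccard_distance(tweet, s) for s in selected)


def select_tweets(
    candidates: Sequence[Sequence[str]],
    labeler_a: Labeler,
    labeler_b: Labeler,
    selected: Sequence[Sequence[str]],
    min_conf: float = 0.8,  # assumed confidence threshold
    min_info: float = 0.5,  # assumed informativeness threshold
) -> List[Tuple[List[str], List[str]]]:
    """Pick unlabeled tweets that look both correct and informative."""
    chosen: List[Tuple[List[str], List[str]]] = []
    pool = [list(s) for s in selected]
    for tweet in candidates:
        labels_a, conf_a = labeler_a(tweet)
        labels_b, conf_b = labeler_b(tweet)
        # Correctness: both labelers agree and both are confident.
        correct = labels_a == labels_b and min(conf_a, conf_b) >= min_conf
        if correct and informativeness(tweet, pool) >= min_info:
            chosen.append((list(tweet), labels_a))
            pool.append(list(tweet))
    return chosen
```

In a self-training loop, the tweets returned by select_tweets would be added to the labeled data with their predicted labels before the labelers are retrained.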

Published

2011-08-04

How to Cite

Liu, X., Kuan, L., Zhou, M., & Xiong, Z. (2011). Enhancing Semantic Role Labeling for Tweets Using Self-Training. Proceedings of the AAAI Conference on Artificial Intelligence, 25(1), 896-901. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/7965

Section

AAAI Technical Track: Natural Language Processing