Attention-based Belief or Disbelief Feature Extraction for Dependency Parsing
Existing neural dependency parsers usually encode each word in a sentence with bi-directional LSTMs, and estimate the score of an arc from the LSTM representations of the head and the modifier, possibly missing relevant context information for the arc being considered. In this study, we propose a neural feature extraction method that learns to extract arc-specific features. We apply a neural network-based attention method to collect evidences for and against each possible head-modifier pair, with which our model computes certainty scores of belief and disbelief, and determines the final arc score by subtracting the score of disbelief from the one of belief. By explicitly introducing two kinds of evidences, the arc candidates can compete against each other based on more relevant information, especially for the cases where they share the same head or modifier. It makes possible to better discriminate two or more competing arcs by presenting their rivals (disbelief evidence). Experiments on various datasets show that our arc-specific feature extraction mechanism significantly improves the performance of bi-directional LSTM-based models by explicitly modeling long-distance dependencies. For both English and Chinese, the proposed model achieve a higher accuracy on dependency parsing task than most existing neural attention-based models.