Nested Named Entity Recognition with Partially-Observed TreeCRFs

Authors

  • Yao Fu, The University of Edinburgh
  • Chuanqi Tan, Alibaba Group
  • Mosha Chen, Alibaba Group
  • Songfang Huang, Alibaba Group
  • Fei Huang, Alibaba Group

Keywords

Information Extraction

Abstract

Named entity recognition (NER) is a well-studied task in natural language processing. However, the widely used sequence labeling framework has difficulty detecting entities with nested structures. In this work, we view nested NER as constituency parsing with partially-observed trees and model it with partially-observed TreeCRFs. Specifically, we view all labeled entity spans as observed nodes in a constituency tree, and all other spans as latent nodes. With the TreeCRF, we obtain a uniform way to jointly model the observed and the latent nodes. To compute the probability of partial trees with partial marginalization, we propose a variant of the Inside algorithm, the Masked Inside algorithm, which supports different inference operations for different nodes (evaluation for the observed, marginalization for the latent, and rejection for nodes incompatible with the observed) and admits an efficient parallelized implementation, thus significantly speeding up training and inference. Experiments show that our approach achieves state-of-the-art (SOTA) F1 scores on the ACE2004 and ACE2005 datasets and performs comparably to SOTA models on the GENIA dataset. We release the code at https://github.com/FranxYao/Partially-Observed-TreeCRFs.

Published

2021-05-18

How to Cite

Fu, Y., Tan, C., Chen, M., Huang, S., & Huang, F. (2021). Nested Named Entity Recognition with Partially-Observed TreeCRFs. Proceedings of the AAAI Conference on Artificial Intelligence, 35(14), 12839-12847. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/17519

Section

AAAI Technical Track on Speech and Natural Language Processing I