Unify Named Entity Recognition Scenarios via Contrastive Real-Time Updating Prototype

Authors

  • Yanhe Liu, School of Computer Science and Engineering, Southeast University, Nanjing, China
  • Peng Wang, School of Computer Science and Engineering, Southeast University, Nanjing, China; Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, China
  • Wenjun Ke, School of Computer Science and Engineering, Southeast University, Nanjing, China; Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, China
  • Guozheng Li, School of Computer Science and Engineering, Southeast University, Nanjing, China
  • Xiye Chen, Nanjing University of Finance & Economics
  • Jiteng Zhao, School of Computer Science and Engineering, Southeast University, Nanjing, China
  • Ziyu Shang, School of Computer Science and Engineering, Southeast University, Nanjing, China

DOI:

https://doi.org/10.1609/aaai.v38i12.29312

Keywords:

ML: Life-Long and Continual Learning

Abstract

Supervised named entity recognition (NER) aims to classify entity mentions into a fixed number of pre-defined types. However, in real-world scenarios, previously unseen entity types keep appearing. Naive fine-tuning results in catastrophic forgetting of old entity types. Existing continual methods usually depend on knowledge distillation to alleviate forgetting, which is less effective on long task sequences. Moreover, most of them are specific to the class-incremental scenario and cannot adapt to the online scenario, which is more common in practice. In this paper, we propose a unified framework called Contrastive Real-time Updating Prototype (CRUP) that can handle different scenarios for NER. Specifically, we train a Gaussian projection model with a regularized contrastive objective. After training on each batch, we store the mean vectors of the representations belonging to new entity types as their prototypes. Meanwhile, we update the existing prototypes of old types based only on representations from the current batch. The final prototypes are used for nearest class mean classification. In this way, CRUP can handle different scenarios through its batch-wise learning. Moreover, CRUP can alleviate forgetting in continual scenarios using only current data, without access to old data. To comprehensively evaluate CRUP, we construct extensive benchmarks based on various datasets. Experimental results show that CRUP significantly outperforms baselines in continual scenarios and is also competitive in the supervised scenario.
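
The prototype bookkeeping described in the abstract (store mean vectors for new types, refresh old prototypes from the current batch, classify by nearest class mean) can be illustrated with a minimal sketch. This is not the authors' implementation: the class name PrototypeStore, the count-weighted running-mean update, and the Euclidean distance are all assumptions made for illustration, since the abstract does not specify the update rule or the metric. The Gaussian projection model and the contrastive objective are not shown, and the toy vectors stand in for learned representations.

```python
import math
from collections import defaultdict


class PrototypeStore:
    """Hypothetical helper: per-type mean vectors, refreshed batch by batch."""

    def __init__(self):
        self.means = {}   # entity type -> prototype (list of floats)
        self.counts = {}  # entity type -> number of representations seen

    def update(self, reps, labels):
        # Group the current batch's representations by entity type.
        grouped = defaultdict(list)
        for rep, label in zip(reps, labels):
            grouped[label].append(rep)
        for label, vecs in grouped.items():
            n = len(vecs)
            batch_mean = [sum(col) / n for col in zip(*vecs)]
            if label not in self.means:
                # New type: store the batch mean as its prototype.
                self.means[label] = batch_mean
                self.counts[label] = n
            else:
                # Old type: count-weighted running mean (an assumption),
                # which needs only the current batch, not stored old data.
                seen = self.counts[label]
                total = seen + n
                self.means[label] = [
                    (o * seen + b * n) / total
                    for o, b in zip(self.means[label], batch_mean)
                ]
                self.counts[label] = total

    def classify(self, rep):
        # Nearest class mean: pick the type whose prototype is closest
        # (Euclidean distance is assumed here).
        return min(self.means, key=lambda t: math.dist(rep, self.means[t]))


store = PrototypeStore()
store.update([[0.0, 0.0], [0.0, 2.0]], ["PER", "PER"])  # PER is a new type
store.update([[4.0, 4.0], [0.0, 4.0]], ["LOC", "PER"])  # LOC new, PER refreshed
prediction = store.classify([3.5, 3.0])
```

Because each update touches only the types present in the current batch, the same loop applies whether batches arrive as complete tasks (class-incremental) or as a stream (online), which mirrors the batch-wise learning the abstract describes.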

Published

2024-03-24

How to Cite

Liu, Y., Wang, P., Ke, W., Li, G., Chen, X., Zhao, J., & Shang, Z. (2024). Unify Named Entity Recognition Scenarios via Contrastive Real-Time Updating Prototype. Proceedings of the AAAI Conference on Artificial Intelligence, 38(12), 14035-14043. https://doi.org/10.1609/aaai.v38i12.29312

Section

AAAI Technical Track on Machine Learning III