Continual Learning for Named Entity Recognition

Authors

  • Natawut Monaikul, University of Illinois at Chicago
  • Giuseppe Castellucci, Amazon
  • Simone Filice, Amazon
  • Oleg Rokhlenko, Amazon

DOI:

https://doi.org/10.1609/aaai.v35i15.17600

Keywords:

Information Extraction

Abstract

Named Entity Recognition (NER) is a vital task in various NLP applications. However, in many real-world scenarios (e.g., voice-enabled assistants) new named entities are frequently introduced, entailing re-training NER models to support these new entities. Re-annotating the original training data for the new entities could be costly or even impossible when storage limitations or security concerns restrict access to that data, and annotating a new dataset for all of the entities becomes impractical and error-prone as the number of entities increases. To tackle this problem, we introduce a novel Continual Learning approach for NER, which requires new training material to be annotated only for the new entities. To preserve the existing knowledge previously learned by the model, we exploit the Knowledge Distillation (KD) framework, where the existing NER model acts as the teacher for a new NER model (i.e., the student), which learns the new entity by using the new training material and retains knowledge of old entities by imitating the teacher's outputs on this new training set. Our experiments show that this approach allows the student model to "progressively" learn to identify new entities without forgetting the previously learned ones. We also present a comparison with multiple strong baselines to demonstrate that our approach is superior for continually updating an NER model.
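
The teacher-student setup described in the abstract can be illustrated with a minimal sketch of a per-token training loss. This is one possible reading and not necessarily the authors' exact formulation. It assumes that a token annotated with the new entity type is trained with cross-entropy against that annotation, while every other token is trained to match the teacher's output distribution through a KL-divergence term. The function names, the label layout (old labels first, the new label appended last), and the unweighted average over tokens are all hypothetical choices made for illustration.

```python
import math


def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(teacher_probs, student_probs):
    """KL(teacher || student); terms with zero teacher mass contribute 0."""
    return sum(
        t * math.log(t / s)
        for t, s in zip(teacher_probs, student_probs)
        if t > 0
    )


def token_loss(student_logits, teacher_probs, new_label_index=None):
    """Loss for one token (hypothetical formulation).

    student_logits: scores over old labels followed by the new label(s).
    teacher_probs: the teacher's distribution over the old labels only.
    new_label_index: index of the annotated new-entity label, or None
        when the new training material does not mark this token.
    """
    student_probs = softmax(student_logits)
    if new_label_index is not None:
        # Token annotated with the new entity: learn from the annotation.
        return -math.log(student_probs[new_label_index])
    # Otherwise imitate the teacher. The teacher assigns no mass to
    # labels it has never seen, so its distribution is padded with zeros.
    padding = [0.0] * (len(student_probs) - len(teacher_probs))
    return kl_divergence(list(teacher_probs) + padding, student_probs)


def sentence_loss(tokens):
    """Average token_loss over (student_logits, teacher_probs, new_label_index) triples."""
    return sum(token_loss(*tok) for tok in tokens) / len(tokens)
```

For example, with old labels O and PER and a new label LOC at index 2, a token annotated as LOC would be passed as `token_loss(logits, teacher_probs, 2)`, and an unannotated token as `token_loss(logits, teacher_probs)`. A real system would compute the same quantities on batched tensors in a deep learning framework and might weight the two terms differently.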

Published

2021-05-18

How to Cite

Monaikul, N., Castellucci, G., Filice, S., & Rokhlenko, O. (2021). Continual Learning for Named Entity Recognition. Proceedings of the AAAI Conference on Artificial Intelligence, 35(15), 13570-13577. https://doi.org/10.1609/aaai.v35i15.17600

Section

AAAI Technical Track on Speech and Natural Language Processing II