ConsistNER: Towards Instructive NER Demonstrations for LLMs with the Consistency of Ontology and Context

Authors

  • Chenxiao Wu, School of Computer Science and Engineering, Southeast University
  • Wenjun Ke, School of Computer Science and Engineering, Southeast University; Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, China
  • Peng Wang, School of Computer Science and Engineering, Southeast University; School of Cyber Science and Engineering, Southeast University; Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications (Southeast University), Ministry of Education, China
  • Zhizhao Luo, Beijing Institute of Computer Technology and Application
  • Guozheng Li, School of Computer Science and Engineering, Southeast University
  • Wanyi Chen, School of Cyber Science and Engineering, Southeast University

DOI:

https://doi.org/10.1609/aaai.v38i17.29892

Keywords:

NLP: Information Extraction, NLP: (Large) Language Models

Abstract

Named entity recognition (NER) aims to identify and classify specific entities mentioned in textual sentences. Most high-performing NER models follow the standard fully supervised paradigm, which requires a large amount of annotated data during training. To maintain performance when annotation resources are scarce (i.e., low-resource settings), in-context learning (ICL) has drawn considerable attention due to its plug-and-play nature compared with other methods (e.g., meta-learning and prompt learning). Under this paradigm, retrieving highly correlated demonstrations for target sentences is the key to eliciting the ICL ability. For the NER task, this correlation implies consistency in both ontology (i.e., generalized entity types) and context (i.e., sentence semantics), which previous NER demonstration retrieval techniques ignore. To address this issue, we propose ConsistNER, a novel three-stage framework that incorporates ontological and contextual information for low-resource NER. First, ConsistNER employs large language models (LLMs) to pre-recognize potential entities in a zero-shot manner. Second, ConsistNER retrieves sentence-specific demonstrations for each target sentence based on the following two considerations: (1) for ontological consistency, demonstrations are filtered into a candidate set based on their ontology distribution; (2) for contextual consistency, an entity-aware self-attention mechanism is introduced to focus more on the potential entities and semantically correlated tokens. Finally, ConsistNER feeds the retrieved demonstrations for all target sentences into LLMs for prediction. We conduct experiments on four widely adopted NER datasets, covering both general and specific domains. Experimental results show that ConsistNER improves over state-of-the-art baselines by 6.01%-26.37% and 3.07%-21.18% in Micro-F1 under the 1- and 5-shot settings, respectively.
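The two retrieval considerations above can be illustrated with a minimal sketch. All names here are hypothetical: the overlap-based ontology filter and the entity-boosted bag-of-words similarity are simple stand-ins for the paper's ontology-distribution filtering and entity-aware self-attention, and the zero-shot pre-recognition step (done by an LLM in ConsistNER) is assumed to be given.

```python
from collections import Counter
import math

def ontology_filter(demos, predicted_types):
    """Stage-2a sketch: keep demonstrations whose entity types overlap the
    target's pre-recognized (zero-shot) entity types."""
    predicted = set(predicted_types)
    return [d for d in demos if predicted & {t for _, t in d["entities"]}]

def entity_aware_similarity(target_tokens, target_entities, demo_tokens, boost=2.0):
    """Stage-2b sketch: cosine similarity over bag-of-words counts, with
    tokens inside potential entities upweighted -- a crude proxy for the
    entity-aware self-attention mechanism."""
    entity_tokens = {tok for span, _ in target_entities for tok in span.split()}
    def vec(tokens):
        counts = Counter()
        for tok in tokens:
            counts[tok] += boost if tok in entity_tokens else 1.0
        return counts
    a, b = vec(target_tokens), vec(demo_tokens)
    dot = sum(a[k] * b[k] for k in a if k in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(target_tokens, pre_recognized, demos, k=1):
    """Filter by ontology, then rank the candidates by entity-aware
    contextual similarity; fall back to all demos if the filter is empty."""
    types = [t for _, t in pre_recognized]
    candidates = ontology_filter(demos, types) or demos
    ranked = sorted(
        candidates,
        key=lambda d: entity_aware_similarity(target_tokens, pre_recognized, d["tokens"]),
        reverse=True,
    )
    return ranked[:k]
```

In this toy setup, a target with a pre-recognized LOC entity first discards ORG-only demonstrations, then prefers the LOC demonstration whose remaining tokens best match the target context; the top-k survivors would be formatted into the ICL prompt.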

Published

2024-03-24

How to Cite

Wu, C., Ke, W., Wang, P., Luo, Z., Li, G., & Chen, W. (2024). ConsistNER: Towards Instructive NER Demonstrations for LLMs with the Consistency of Ontology and Context. Proceedings of the AAAI Conference on Artificial Intelligence, 38(17), 19234–19242. https://doi.org/10.1609/aaai.v38i17.29892

Section

AAAI Technical Track on Natural Language Processing II