A Simple Yet Effective Subsequence-Enhanced Approach for Cross-Domain NER

Authors

  • Jinpeng Hu Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen, Guangdong, China
  • DanDan Guo The Chinese University of Hong Kong, Shenzhen
  • Yang Liu Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen, Guangdong, China
  • Zhuo Li Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen, Guangdong, China
  • Zhihong Chen Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen, Guangdong, China
  • Xiang Wan Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen, Guangdong, China Pazhou Lab, Guangzhou, 510330, China
  • Tsung-Hui Chang Shenzhen Research Institute of Big Data, The Chinese University of Hong Kong, Shenzhen, Guangdong, China The Chinese University of Hong Kong, Shenzhen

DOI:

https://doi.org/10.1609/aaai.v37i11.26515

Keywords:

SNLP: Information Extraction, ML: Transfer, Domain Adaptation, Multi-Task Learning

Abstract

Cross-domain named entity recognition (NER), which aims to address the limitation of labeled resources in the target domain, is a challenging yet important task. Most existing studies alleviate the data discrepancy across domains at a coarse level, either by combining NER with language modeling or by introducing domain-adaptive pre-training (DAPT). Notably, source and target domains tend to share more fine-grained local information within denser subsequences than global information within the whole sequence, so subsequence features are easier to transfer; this has not been well explored. Besides, compared to token-level representations, subsequence-level information can help the model distinguish different meanings of the same word in different domains. In this paper, we propose to incorporate subsequence-level features to promote cross-domain NER. In detail, we first utilize a pre-trained encoder to extract global information. Then, we re-express each sentence as a group of subsequences and propose a novel bidirectional memory recurrent unit (BMRU) to capture features from these subsequences. Finally, an adaptive coupling unit (ACU) combines the global information and subsequence features to predict entity labels. Experimental results on several benchmark datasets demonstrate the effectiveness of our model, which achieves considerable improvements.
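The pipeline described in the abstract (global encoding, re-expressing a sentence as subsequences, and adaptively coupling the two feature streams) can be sketched at a very high level. The code below is a hypothetical illustration, not the paper's actual BMRU or ACU: mean pooling stands in for the pre-trained encoder and for the subsequence encoder, and a simple sigmoid gate stands in for the adaptive coupling unit. The chunk size, stride, and embedding dimension are arbitrary choices for the sketch.

```python
import numpy as np

def split_subsequences(tokens, size=3, stride=1):
    """Re-express a sentence as overlapping subsequences (illustrative chunking)."""
    return [tokens[i:i + size] for i in range(0, max(len(tokens) - size + 1, 1), stride)]

rng = np.random.default_rng(0)
d = 8  # toy embedding dimension
tokens = ["Apple", "opened", "a", "store", "in", "Paris"]
emb = {t: rng.normal(size=d) for t in tokens}  # random stand-in embeddings

# Global feature: mean of token embeddings (stand-in for a pre-trained encoder).
H_global = np.mean([emb[t] for t in tokens], axis=0)

# Local features: pool over each subsequence, then over all subsequences
# (stand-in for the paper's bidirectional memory recurrent unit).
subs = split_subsequences(tokens, size=3)
H_local = np.mean([np.mean([emb[t] for t in s], axis=0) for s in subs], axis=0)

# Adaptive coupling: a learned sigmoid gate mixes the two streams
# (stand-in for the paper's adaptive coupling unit).
W = rng.normal(size=(d, 2 * d))
g = 1 / (1 + np.exp(-(W @ np.concatenate([H_global, H_local]))))
H = g * H_global + (1 - g) * H_local
print(H.shape)  # (8,) — fused representation fed to the label classifier
```

In a real system the fused representation `H` would be computed per token and passed to a tagging layer (e.g. a CRF) to predict entity labels; here the pooled vectors only show how the global and subsequence streams are combined.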

Published

2023-06-26

How to Cite

Hu, J., Guo, D., Liu, Y., Li, Z., Chen, Z., Wan, X., & Chang, T.-H. (2023). A Simple Yet Effective Subsequence-Enhanced Approach for Cross-Domain NER. Proceedings of the AAAI Conference on Artificial Intelligence, 37(11), 12890-12898. https://doi.org/10.1609/aaai.v37i11.26515

Section

AAAI Technical Track on Speech & Natural Language Processing