Universal Information Extraction as Unified Semantic Matching

Authors

  • Jie Lou, Baidu, Inc.
  • Yaojie Lu, Chinese Information Processing Laboratory, Institute of Software, Chinese Academy of Sciences
  • Dai Dai, Baidu, Inc.
  • Wei Jia, Baidu, Inc.
  • Hongyu Lin, Chinese Information Processing Laboratory, Institute of Software, Chinese Academy of Sciences
  • Xianpei Han, Chinese Information Processing Laboratory, Institute of Software, Chinese Academy of Sciences; State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences
  • Le Sun, Chinese Information Processing Laboratory, Institute of Software, Chinese Academy of Sciences; State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences
  • Hua Wu, Baidu, Inc.

DOI:

https://doi.org/10.1609/aaai.v37i11.26563

Keywords:

SNLP: Information Extraction

Abstract

The challenge of information extraction (IE) lies in the diversity of label schemas and the heterogeneity of structures. Traditional methods require task-specific model design and rely heavily on expensive supervision, making them difficult to generalize to new schemas. In this paper, we decouple IE into two basic abilities, structuring and conceptualizing, which are shared by different tasks and schemas. Based on this paradigm, we propose to universally model various IE tasks with the Unified Semantic Matching (USM) framework, which introduces three unified token linking operations to model the abilities of structuring and conceptualizing. In this way, USM can jointly encode schema and input text, uniformly extract substructures in parallel, and controllably decode target structures on demand. Empirical evaluation on four IE tasks shows that the proposed method achieves state-of-the-art performance in supervised experiments and strong generalization ability in zero-shot and few-shot transfer settings.
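
As a rough illustration of how token links could be decoded into typed structures, the toy sketch below combines two kinds of links: token-to-token links that mark a span (structuring) and label-to-token links that attach a schema label to a span (conceptualizing). It is not the authors' implementation. The function name `decode_typed_spans`, the link sets, and the choice to attach a label to the head token of a span are all assumptions made for this example, and the links are hard-coded here, whereas in a real system a model would predict them.

```python
from typing import List, Set, Tuple


def decode_typed_spans(
    tokens: List[str],
    labels: List[str],
    head_tail_links: Set[Tuple[int, int]],   # (head token index, tail token index)
    label_head_links: Set[Tuple[int, int]],  # (label index, head token index)
) -> List[Tuple[str, str]]:
    """Combine span links with label links into (label, mention) pairs.

    Hypothetical helper: a span is typed when some label links to its head token.
    """
    results = []
    for head, tail in sorted(head_tail_links):
        # Structuring: a head-to-tail link delimits one mention span.
        mention = " ".join(tokens[head:tail + 1])
        for label_idx, linked_head in sorted(label_head_links):
            # Conceptualizing: a label-to-token link assigns a type to the span.
            if linked_head == head:
                results.append((labels[label_idx], mention))
    return results


# Toy input; the links below are hand-written stand-ins for model predictions.
tokens = ["Steve", "Jobs", "founded", "Apple"]
labels = ["person", "organization"]
head_tail = {(0, 1), (3, 3)}
label_head = {(0, 0), (1, 3)}

typed_spans = decode_typed_spans(tokens, labels, head_tail, label_head)
print(typed_spans)
```

Because the schema labels are ordinary inputs in this sketch, changing the `labels` list changes what can be extracted without changing the decoding code, which is the intuition behind decoding target structures on demand.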

Published

2023-06-26

How to Cite

Lou, J., Lu, Y., Dai, D., Jia, W., Lin, H., Han, X., Sun, L., & Wu, H. (2023). Universal Information Extraction as Unified Semantic Matching. Proceedings of the AAAI Conference on Artificial Intelligence, 37(11), 13318-13326. https://doi.org/10.1609/aaai.v37i11.26563

Section

AAAI Technical Track on Speech & Natural Language Processing