An Autoregressive Text-to-Graph Framework for Joint Entity and Relation Extraction

Authors

  • Urchade Zaratiana FI Group, Puteaux, France; LIPN - Université Sorbonne Paris Nord - CNRS UMR 7030, Villetaneuse, France
  • Nadi Tomeh LIPN - Université Sorbonne Paris Nord - CNRS UMR 7030, Villetaneuse, France
  • Pierre Holat FI Group, Puteaux, France; LIPN - Université Sorbonne Paris Nord - CNRS UMR 7030, Villetaneuse, France
  • Thierry Charnois LIPN - Université Sorbonne Paris Nord - CNRS UMR 7030, Villetaneuse, France

DOI:

https://doi.org/10.1609/aaai.v38i17.29919

Keywords:

NLP: Information Extraction, NLP: Generation, NLP: (Large) Language Models

Abstract

In this paper, we propose a novel method for joint entity and relation extraction from unstructured text by framing it as a conditional sequence generation problem. In contrast to conventional generative information extraction models, which are left-to-right token-level generators, our approach is span-based. It generates a linearized graph where nodes represent text spans and edges represent relation triplets. Our method employs a transformer encoder-decoder architecture with a pointing mechanism on a dynamic vocabulary of spans and relation types. Our model can capture the structural characteristics and boundaries of entities and relations through span representations while simultaneously grounding the generated output in the original text thanks to the pointing mechanism. Evaluation on benchmark datasets validates the effectiveness of our approach, demonstrating competitive results. Code is available at https://github.com/urchade/ATG.
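To make the idea of a linearized graph over a dynamic vocabulary concrete, the following is a minimal sketch and not the authors' implementation. It assumes, for illustration only, that entity nodes are emitted first, followed by a separator, followed by (head, tail, relation type) triplets; the names `build_vocab`, `linearize`, `<SEP>`, `<END>`, the maximum span width, and the example sentence and labels are all hypothetical. Each target element is an index into a vocabulary built from the input sentence itself, which is the sense in which a pointing decoder stays grounded in the original text.

```python
from typing import List, Tuple, Union

Span = Tuple[int, int, str]        # (start token, end token inclusive, entity type)
Relation = Tuple[int, int, str]    # (head entity index, tail entity index, relation type)
Symbol = Union[Span, str]

def build_vocab(n_tokens: int, max_width: int,
                entity_types: List[str], relation_types: List[str]) -> List[Symbol]:
    """Build a per-sentence vocabulary: special symbols, relation types,
    and every typed candidate span up to max_width tokens (an assumed layout)."""
    vocab: List[Symbol] = ["<SEP>", "<END>"]
    vocab += relation_types
    for start in range(n_tokens):
        for end in range(start, min(start + max_width, n_tokens)):
            for etype in entity_types:
                vocab.append((start, end, etype))
    return vocab

def linearize(entities: List[Span], relations: List[Relation]) -> List[Symbol]:
    """Flatten a graph into one sequence: nodes, separator, then edge triplets."""
    seq: List[Symbol] = list(sorted(entities))   # nodes in left-to-right order
    seq.append("<SEP>")
    for head, tail, rtype in relations:
        seq += [entities[head], entities[tail], rtype]
    seq.append("<END>")
    return seq

# Hypothetical example sentence and annotations.
tokens = ["Alain", "Farley", "works", "at", "McGill", "University"]
entities = [(0, 1, "PER"), (4, 5, "ORG")]
relations = [(0, 1, "Work_For")]

vocab = build_vocab(len(tokens), max_width=3,
                    entity_types=["PER", "ORG"], relation_types=["Work_For"])
target = linearize(entities, relations)
# Each decoding step "points" at one vocabulary entry instead of generating free text.
target_ids = [vocab.index(symbol) for symbol in target]
print(target)
print(target_ids)
```

Because the vocabulary is rebuilt for every input, the decoder in such a setup can only select spans that actually occur in the sentence, which is one way to read the grounding property described in the abstract.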

Published

2024-03-24

How to Cite

Zaratiana, U., Tomeh, N., Holat, P., & Charnois, T. (2024). An Autoregressive Text-to-Graph Framework for Joint Entity and Relation Extraction. Proceedings of the AAAI Conference on Artificial Intelligence, 38(17), 19477-19487. https://doi.org/10.1609/aaai.v38i17.29919

Issue

Section

AAAI Technical Track on Natural Language Processing II