GraphER: Token-Centric Entity Resolution with Graph Convolutional Neural Networks

Authors

  • Bing Li, University of New South Wales
  • Wei Wang, University of New South Wales
  • Yifang Sun, University of New South Wales
  • Linhan Zhang, University of New South Wales
  • Muhammad Asif Ali, University of New South Wales
  • Yi Wang, Dongguan University of Technology

DOI:

https://doi.org/10.1609/aaai.v34i05.6330

Abstract

Entity resolution (ER) aims to identify entity records that refer to the same real-world entity and is a critical problem in data cleaning and integration. Most existing models are attribute-centric: they match entity pairs by comparing the similarities of pre-aligned attributes, which requires the records to share an identical schema and is too coarse-grained to capture subtle key information within a single attribute. In this paper, we propose GraphER, a novel graph-based ER model. Our model is token-centric: the final matching results are generated by directly aggregating token-level comparison features, in which both semantic and structural information have been softly embedded into token embeddings by training an Entity Record Graph Convolutional Network (ER-GCN). To the best of our knowledge, our work is the first to perform token-centric entity resolution with the help of a GCN. Extensive experiments on two real-world datasets demonstrate that our model consistently outperforms state-of-the-art models.
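
The token-centric idea can be illustrated with a small toy sketch. This is not the authors' ER-GCN: it assumes one-hot initial features, unweighted mean aggregation in place of trained graph convolution layers, and a simple best-match cosine score in place of the learned comparison and aggregation. All function and variable names (`build_graph`, `propagate`, `match_score`, the sample records) are hypothetical and chosen for illustration only.

```python
import math
from collections import defaultdict


def tokenize(record):
    """Flatten all attribute values into lowercase tokens, ignoring the schema."""
    return [tok for value in record.values() for tok in value.lower().split()]


def build_graph(records):
    """Link each record node to the token nodes it contains (undirected)."""
    adj = defaultdict(set)
    for rid, record in records.items():
        for tok in tokenize(record):
            adj[("rec", rid)].add(("tok", tok))
            adj[("tok", tok)].add(("rec", rid))
    return adj


def propagate(adj, feats):
    """One unweighted step: average each node's vector with its neighbours'."""
    out = {}
    for node, nbrs in adj.items():
        group = [node, *nbrs]
        dim = len(feats[node])
        out[node] = [sum(feats[n][d] for n in group) / len(group) for d in range(dim)]
    return out


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def match_score(a, b, records, emb):
    """Mean over tokens of `a` of the best cosine similarity to any token of `b`."""
    toks_a, toks_b = tokenize(records[a]), tokenize(records[b])
    best = [max(cosine(emb[("tok", s)], emb[("tok", t)]) for t in toks_b) for s in toks_a]
    return sum(best) / len(best)


# Made-up records with different schemas; no attribute alignment is needed.
records = {
    "r1": {"title": "apple iphone 11", "brand": "apple"},
    "r2": {"name": "iphone 11 by apple"},
    "r3": {"title": "samsung galaxy 11"},
}

adj = build_graph(records)
nodes = list(adj)
# One-hot starting features: an assumption of this sketch, not of the paper.
emb = {n: [1.0 if i == j else 0.0 for j in range(len(nodes))] for i, n in enumerate(nodes)}
for _ in range(2):  # two steps let tokens see other tokens through shared records
    emb = propagate(adj, emb)

same = match_score("r1", "r2", records, emb)
other = match_score("r1", "r3", records, emb)
```

In this toy setting `same` should come out higher than `other`, because every token of `r1` also appears in `r2` while `r3` shares only the token `11`. A trained model would replace both the averaging step and the cosine score with learned components.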

Published

2020-04-03

How to Cite

Li, B., Wang, W., Sun, Y., Zhang, L., Ali, M. A., & Wang, Y. (2020). GraphER: Token-Centric Entity Resolution with Graph Convolutional Neural Networks. Proceedings of the AAAI Conference on Artificial Intelligence, 34(05), 8172-8179. https://doi.org/10.1609/aaai.v34i05.6330

Section

AAAI Technical Track: Natural Language Processing