Task-Specific Representation Learning for Web-Scale Entity Disambiguation

Authors

  • Rijula Kar, IIT Kharagpur
  • Susmija Reddy, IIT Kharagpur
  • Sourangshu Bhattacharya, IIT Kharagpur
  • Anirban Dasgupta, IIT Gandhinagar
  • Soumen Chakrabarti, IIT Bombay

DOI:

https://doi.org/10.1609/aaai.v32i1.12066

Keywords:

Information Extraction, Entity Disambiguation, Multitask Learning

Abstract

Named entity disambiguation (NED) is a central problem in information extraction. The goal is to link entities in a knowledge graph (KG) to their mention spans in unstructured text. Each distinct mention span (such as John Smith, Jordan, or Apache) represents a multi-class classification task. NED can therefore be modeled as a multitask problem with tens of millions of tasks for realistic KGs. We initiate an investigation into neural representations, network architectures, and training protocols for multitask NED. Specifically, we propose a task-sensitive representation learning framework that learns mention-dependent representations, followed by a common classifier. Parameter learning in our framework can be decomposed into solving multiple smaller problems involving overlapping groups of tasks. We prove bounds for excess risk, which provide additional insight into the problem of multitask representation learning. While remaining practical in terms of training memory and time requirements, our approach outperforms recent strong baselines on four benchmark data sets.
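The central architectural idea in the abstract, a mention-dependent (task-specific) representation feeding a classifier shared across all mention tasks, can be illustrated with a rough sketch. The code below is a minimal, hypothetical PyTorch example with toy dimensions; the class and parameter names (TaskSensitiveNED, task_proj, entity_emb) are illustrative assumptions, not the paper's actual architecture, and in a web-scale setting overlapping groups of mentions would share projections to keep memory bounded, as the abstract suggests.

```python
# Illustrative sketch only (assumes PyTorch; names and dimensions are hypothetical).
import torch
import torch.nn as nn

class TaskSensitiveNED(nn.Module):
    """Mention-dependent representation followed by a classifier
    shared across all mention tasks."""

    def __init__(self, num_mention_groups, num_entities, ctx_dim, rep_dim):
        super().__init__()
        # One small projection per mention group ("task"); groups of tasks
        # can share a projection to keep the parameter count bounded.
        self.task_proj = nn.Embedding(num_mention_groups, ctx_dim * rep_dim)
        # Common scorer: entity embeddings matched against the mention representation.
        self.entity_emb = nn.Embedding(num_entities, rep_dim)

    def forward(self, mention_ids, context_vecs, candidate_ids):
        # context_vecs: (batch, ctx_dim) pooled features of the mention's context.
        batch, ctx_dim = context_vecs.shape
        W = self.task_proj(mention_ids).view(batch, ctx_dim, -1)    # (batch, ctx_dim, rep_dim)
        rep = torch.bmm(context_vecs.unsqueeze(1), W).squeeze(1)    # mention-dependent representation
        cand = self.entity_emb(candidate_ids)                       # (batch, n_cand, rep_dim)
        scores = torch.bmm(cand, rep.unsqueeze(2)).squeeze(2)       # (batch, n_cand) candidate scores
        return scores

# Toy usage: 3 mention groups, 10 candidate entities, 2 candidates per mention.
model = TaskSensitiveNED(num_mention_groups=3, num_entities=10, ctx_dim=8, rep_dim=4)
scores = model(torch.tensor([0, 2]),
               torch.randn(2, 8),
               torch.tensor([[1, 5], [3, 7]]))
print(scores.shape)  # torch.Size([2, 2])
```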

Published

2018-04-26

How to Cite

Kar, R., Reddy, S., Bhattacharya, S., Dasgupta, A., & Chakrabarti, S. (2018). Task-Specific Representation Learning for Web-Scale Entity Disambiguation. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1). https://doi.org/10.1609/aaai.v32i1.12066