Task-Specific Representation Learning for Web-Scale Entity Disambiguation

Rijula Kar; Susmija Reddy; Sourangshu Bhattacharya; Anirban Dasgupta; Soumen Chakrabarti

doi:10.1609/aaai.v32i1.12066

Authors

Rijula Kar IIT, Kharagpur
Susmija Reddy IIT, Kharagpur
Sourangshu Bhattacharya IIT, Kharagpur
Anirban Dasgupta IIT, Gandhinagar
Soumen Chakrabarti IIT, Bombay

DOI:

https://doi.org/10.1609/aaai.v32i1.12066

Keywords:

Information Extraction, Entity Disambiguation, Multitask Learning

Abstract

Named entity disambiguation (NED) is a central problem in information extraction. The goal is to link entities in a knowledge graph (KG) to their mention spans in unstructured text. Each distinct mention span (like John Smith, Jordan, or Apache) represents a multi-class classification task. NED can therefore be modeled as a multitask problem with tens of millions of tasks for realistic KGs. We initiate an investigation into neural representations, network architectures, and training protocols for multitask NED. Specifically, we propose a task-sensitive representation learning framework that learns mention-dependent representations, followed by a common classifier. Parameter learning in our framework can be decomposed into solving multiple smaller problems involving overlapping groups of tasks. We prove bounds for excess risk, which provide additional insight into the problem of multitask representation learning. While remaining practical in terms of training memory and time requirements, our approach outperforms recent strong baselines on four benchmark data sets.
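
The multitask view described in the abstract can be illustrated with a minimal sketch. It is not the authors' architecture: the class name `ToyMultitaskNED`, the linear per-mention maps, the per-entity weight vectors, the dimensions, and the candidate entity names are all hypothetical choices made for illustration, and the weights are random rather than learned. The sketch only shows the overall shape: each mention span is its own classification task over its candidate entities, with a mention-dependent representation feeding a classifier whose parameters are shared across mentions.

```python
import math
import random


def random_matrix(rows, cols, rng):
    """Small random matrix as a list of rows (stand-in for learned weights)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vec):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class ToyMultitaskNED:
    """One task per mention span; shared entity scorer (illustrative assumption)."""

    def __init__(self, candidates, context_dim, rep_dim, seed=0):
        rng = random.Random(seed)
        self.candidates = candidates
        # Task-specific part: one projection per mention span.
        self.mention_maps = {
            m: random_matrix(rep_dim, context_dim, rng) for m in candidates
        }
        # Common part: one weight vector per KG entity, shared by all mentions.
        entities = sorted({e for ents in candidates.values() for e in ents})
        self.entity_weights = {
            e: [rng.gauss(0.0, 0.1) for _ in range(rep_dim)] for e in entities
        }

    def predict(self, mention, context):
        """Return a probability for each candidate entity of this mention."""
        rep = matvec(self.mention_maps[mention], context)
        ents = self.candidates[mention]
        scores = [
            sum(w * r for w, r in zip(self.entity_weights[e], rep)) for e in ents
        ]
        return dict(zip(ents, softmax(scores)))


# Hypothetical candidate sets for two of the mention spans named in the abstract.
candidates = {
    "Jordan": ["Jordan_(country)", "Michael_Jordan"],
    "Apache": ["Apache_HTTP_Server", "Apache_people"],
}
model = ToyMultitaskNED(candidates, context_dim=4, rep_dim=3)
probs = model.predict("Jordan", [1.0, 0.0, 0.5, -0.2])
print(probs)
```

In this toy setup the number of task-specific parameters grows with the number of mention spans while the entity scorer is shared, which is one simple way to picture why the abstract stresses training memory and time at web scale.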
