Acronym Disambiguation Using Word Embedding

Chao Li; Lei Ji; Jun Yan

doi:10.1609/aaai.v29i1.9713

Authors

Chao Li Dalian University of Technology
Lei Ji Microsoft Research Asia
Jun Yan Microsoft Research Asia

Keywords:

Word Embedding, Acronym Disambiguation, Machine Learning

Abstract

According to AcronymFinder.com, one of the world's largest and most comprehensive dictionaries of acronyms, an average of 37 new human-edited acronym definitions are added every day. To date, the site lists 379,918 acronyms with 4,766,899 definitions, an average of 12.5 definitions per acronym. Identifying exactly what an acronym means in a given context is an important research topic for both document comprehension and document retrieval. In this paper, we propose two word embedding based models for acronym disambiguation. Word embedding represents words in a continuous, multidimensional vector space, so that the semantic similarity between words can be computed from the distance between their vectors. We evaluate the models on the MSH dataset and the ScienceWISE dataset, and both models outperform state-of-the-art methods in accuracy. The experimental results show that word embedding helps to improve acronym disambiguation.
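
The idea of comparing words by vector distance can be illustrated with a minimal sketch. It is not either of the two models proposed in the paper, which the abstract does not describe. It assumes a generic baseline: average the embeddings of the context words, average the embeddings of each candidate expansion, and pick the expansion with the highest cosine similarity. The three-dimensional vectors in TOY_EMBEDDINGS and the helper names cosine, mean_vector and disambiguate are made up for illustration; real embeddings would be learned from a corpus.

```python
import math

# Hypothetical toy embeddings, invented for illustration only.
TOY_EMBEDDINGS = {
    "computer": [0.9, 0.1, 0.0],
    "network": [0.8, 0.2, 0.1],
    "asynchronous": [0.85, 0.1, 0.05],
    "transfer": [0.7, 0.2, 0.1],
    "mode": [0.6, 0.3, 0.1],
    "withdraw": [0.1, 0.85, 0.05],
    "cash": [0.1, 0.9, 0.0],
    "bank": [0.05, 0.95, 0.1],
    "automated": [0.4, 0.5, 0.1],
    "teller": [0.1, 0.9, 0.1],
    "machine": [0.5, 0.4, 0.1],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_vector(words, embeddings):
    """Average the vectors of the words that have an embedding."""
    vectors = [embeddings[w] for w in words if w in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def disambiguate(context_words, candidates, embeddings):
    """Return the candidate expansion closest to the context, or None."""
    context_vec = mean_vector(context_words, embeddings)
    if context_vec is None:
        return None
    best, best_score = None, -math.inf
    for expansion in candidates:
        expansion_vec = mean_vector(expansion.split(), embeddings)
        if expansion_vec is None:
            continue  # no known words in this expansion
        score = cosine(context_vec, expansion_vec)
        if score > best_score:
            best, best_score = expansion, score
    return best


ATM_CANDIDATES = ["automated teller machine", "asynchronous transfer mode"]

print(disambiguate(["withdraw", "cash", "bank"], ATM_CANDIDATES, TOY_EMBEDDINGS))
print(disambiguate(["computer", "network"], ATM_CANDIDATES, TOY_EMBEDDINGS))
```

With these toy vectors, a banking context selects "automated teller machine" and a networking context selects "asynchronous transfer mode". Averaging is only one way to combine word vectors and is used here for brevity.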
