Cross-Lingual Taxonomy Alignment with Bilingual Biterm Topic Model

Authors

  • Tianxing Wu Southeast University
  • Guilin Qi Southeast University
  • Haofen Wang East China University of Science and Technology
  • Kang Xu Southeast University
  • Xuan Cui Southeast University

DOI:

https://doi.org/10.1609/aaai.v30i1.9979

Keywords:

Cross-lingual Taxonomy Alignment, Bilingual Biterm Topic Model, Vector Similarities

Abstract

As more multilingual knowledge becomes available on the Web, sharing knowledge across languages has become an important task that benefits many applications. One of the most crucial kinds of knowledge on the Web is the taxonomy, which is used to organize and classify Web data. To facilitate knowledge sharing across languages, we need to address the problem of cross-lingual taxonomy alignment: for each category in the source taxonomy of one language, discovering the most relevant category in the target taxonomy of another language. Current approaches to aligning cross-lingual taxonomies rely heavily on domain-specific information and on features based on string similarities. In this paper, we present a new approach to cross-lingual taxonomy alignment that does not use any domain-specific information. We first identify, for each category in the source taxonomy, the candidate matched categories in the target taxonomy using cross-lingual string similarity. We then propose a novel bilingual topic model, called Bilingual Biterm Topic Model (BiBTM), to perform exact matching. BiBTM is trained on textual contexts extracted from the Web. We conduct experiments on two kinds of real-world datasets. The experimental results show that our approach significantly outperforms the state-of-the-art comparison methods we designed.
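The two-stage structure described in the abstract (candidate identification by string similarity, then exact matching by comparing vectors) can be illustrated with a minimal sketch. This is not the authors' implementation. All function names are hypothetical, the source label is assumed to be already translated into the target language, `difflib`'s ratio stands in for the paper's cross-lingual string similarity, the threshold of 0.5 is arbitrary, and the topic vectors are hand-written placeholders for what a trained model such as BiBTM might produce.

```python
from difflib import SequenceMatcher
from math import sqrt


def string_similarity(a, b):
    """Character-level similarity in [0, 1] (a stand-in measure, not the paper's)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_categories(translated_label, target_labels, threshold=0.5):
    """Stage 1: keep target categories whose label resembles the translated source label."""
    return [t for t in target_labels
            if string_similarity(translated_label, t) >= threshold]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def align(translated_label, source_vec, target_vecs, threshold=0.5):
    """Stage 2: among the stage-1 candidates, pick the one with the closest vector."""
    candidates = candidate_categories(translated_label, list(target_vecs), threshold)
    if not candidates:
        return None
    return max(candidates, key=lambda t: cosine(source_vec, target_vecs[t]))


# Placeholder topic vectors (assumed values, for illustration only).
target_vecs = {
    "Mobile Phones": [0.9, 0.1],
    "Mobile Phone Cases": [0.2, 0.8],
    "Garden Tools": [0.8, 0.2],
}
best = align("mobile phone", [0.8, 0.2], target_vecs)
print(best)
```

In this toy input, "Garden Tools" has a vector identical to the source category but is never considered, because it fails the string-similarity filter in the first stage. The second stage then separates the two remaining, similarly named candidates by their vectors.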

Published

2016-02-21

How to Cite

Wu, T., Qi, G., Wang, H., Xu, K., & Cui, X. (2016). Cross-Lingual Taxonomy Alignment with Bilingual Biterm Topic Model. Proceedings of the AAAI Conference on Artificial Intelligence, 30(1). https://doi.org/10.1609/aaai.v30i1.9979