Cross-Lingual Knowledge Validation Based Taxonomy Derivation from Heterogeneous Online Wikis


  • Zhigang Wang Tsinghua University
  • Juanzi Li Tsinghua University
  • Shuangjie Li Tsinghua University
  • Mingyang Li Tsinghua University
  • Jie Tang Tsinghua University
  • Kuo Zhang Sogou Inc.
  • Kun Zhang Sogou Inc.



Creating knowledge bases from crowd-sourced wikis such as Wikipedia has attracted significant research interest in the field of the intelligent Web. However, the derived taxonomies usually contain many mistakenly imported taxonomic relations, because user-generated subsumption relations differ from semantic taxonomic relations. Current approaches to this problem still suffer from the following issues: (i) heuristic-based methods rely strongly on language-specific rules; (ii) corpus-based methods depend on a large-scale, high-quality corpus, which is often unavailable. In this paper, we formulate cross-lingual taxonomy derivation as a problem of cross-lingual taxonomic relation prediction. We investigate different linguistic heuristics and language-independent features, and propose a cross-lingual knowledge validation based dynamic adaptive boosting model to iteratively reinforce the performance of taxonomic relation prediction. The proposed approach overcomes the above issues, and experiments show that it significantly outperforms the state-of-the-art comparison methods.
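
As a rough illustration of the boosting idea mentioned in the abstract, the sketch below implements plain AdaBoost with decision stumps for a binary "is this a valid taxonomic relation?" decision. It is not the paper's dynamic adaptive boosting model and does not include the cross-lingual validation step; the two feature columns (a label-overlap score and a cross-lingual agreement flag), the toy data, and all function names are assumptions made purely for illustration.

```python
import math

def train_stump(X, y, w):
    """Return (weighted_error, feature, threshold, polarity) of the best stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for pol in (1, -1):
                preds = [pol if row[j] >= thr else -pol for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, pol)
    return best

def stump_predict(stump, row):
    _, j, thr, pol = stump
    return pol if row[j] >= thr else -pol

def adaboost(X, y, rounds=5):
    """Standard AdaBoost: reweight examples so later stumps focus on mistakes."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        if stump[0] >= 0.5:  # no better than chance: stop adding learners
            break
        # Clamp the error to avoid division by zero or log(0).
        err = min(max(stump[0], 1e-10), 1 - 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, stump))
        # Increase weights of misclassified examples, decrease the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, row):
    score = sum(alpha * stump_predict(s, row) for alpha, s in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical features per candidate relation:
# [label_overlap_score, cross_lingual_agreement]; labels are +1 / -1.
X = [[0.9, 1.0], [0.8, 0.0], [0.2, 1.0], [0.1, 0.0], [0.7, 1.0], [0.3, 0.0]]
y = [1, 1, -1, -1, 1, -1]
model = adaboost(X, y, rounds=5)
predictions = [predict(model, row) for row in X]
```

In this toy data the first feature alone separates the two classes, so the ensemble reproduces the training labels; real candidate relations would need richer features and held-out evaluation.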




How to Cite

Wang, Z., Li, J., Li, S., Li, M., Tang, J., Zhang, K., & Zhang, K. (2014). Cross-Lingual Knowledge Validation Based Taxonomy Derivation from Heterogeneous Online Wikis. Proceedings of the AAAI Conference on Artificial Intelligence, 28(1).