Semantic Structure-Based Word Embedding by Incorporating Concept Convergence and Word Divergence
Keywords: Word embedding, natural language processing
Representing the semantics of words is a fundamental task in text processing. Several studies have shown that text and knowledge bases (KBs) are complementary sources for learning word embeddings. However, most existing methods use only the relationships between word pairs when exploiting KBs. We argue that the structural information of well-organized words within KBs conveys more effective and stable knowledge for capturing word semantics. In this paper, we propose a semantic structure-based word embedding method and introduce concept convergence and word divergence to reveal semantic structures during embedding learning. To assess the effectiveness of our method, we use WordNet for training and conduct extensive experiments on word similarity, word analogy, text classification, and query expansion. The experimental results show that our method outperforms state-of-the-art methods, including those trained solely on a corpus and those trained on both a corpus and KBs.