Commonsense Knowledge Augmentation for Low-Resource Languages via Adversarial Learning

Bosung Kim; Juae Kim; Youngjoong Ko; Jungyun Seo

doi:10.1609/aaai.v35i7.16793

Authors

Bosung Kim, Sungkyunkwan University
Juae Kim, Sogang University; Hyundai Motor Group
Youngjoong Ko, Sungkyunkwan University
Jungyun Seo, Sogang University

DOI:

https://doi.org/10.1609/aaai.v35i7.16793

Keywords:

Common-Sense Reasoning, Knowledge Acquisition, Adversarial Learning & Robustness, Question Answering

Abstract

Commonsense reasoning is one of the ultimate goals of artificial intelligence research because it simulates the human thinking process. However, most commonsense reasoning studies have focused on English, because commonsense knowledge for low-resource languages is scarce owing to high construction costs. Translation is a typical method for augmenting data for low-resource languages; however, it introduces ambiguity, because polysemes and homonyms allow one word to be translated into multiple words. Previous studies have proposed methods that measure the validity of the multiple translated triples by using additional metadata and manually labeled data. However, such handcrafted datasets are not available for many low-resource languages. In this paper, we propose a knowledge augmentation method using adversarial networks that does not require any labeled data. Our adversarial networks transfer knowledge learned from a resource-rich language to low-resource languages and can thus assign validity scores to translated triples even without labeled data. We designed experiments to demonstrate that high-scoring triples obtained by the proposed model can be considered augmented knowledge. The experimental results show that our proposed method achieved 93.7% precision@1 on a manually labeled benchmark for a low-resource language, Korean. Furthermore, to verify our model on other low-resource languages, we introduced new test sets for knowledge validation in 16 different languages. Our adversarial model obtains strong results on all language test sets. We will release the augmented Korean knowledge and the test sets for 16 languages.
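
As a rough illustration of the precision@1 figure mentioned above, the following sketch shows one way such a metric could be computed once a model has assigned validity scores to the candidate translations of each source triple. It assumes that precision@1 means the fraction of source triples whose highest-scoring candidate is judged valid; the exact evaluation protocol of the paper may differ. The function name `precision_at_1`, the data layout, the example triples, and the scores are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)


def precision_at_1(
    scored: Dict[Triple, List[Tuple[Triple, float]]],
    gold: Set[Triple],
) -> float:
    """Fraction of source triples whose top-scored candidate is in `gold`.

    `scored` maps each source-language triple to its candidate translations,
    each paired with a validity score (higher means more plausible).
    Source triples with no candidates are skipped.
    """
    hits = 0
    total = 0
    for candidates in scored.values():
        if not candidates:
            continue
        # Pick the candidate translation with the highest validity score.
        best_triple, _ = max(candidates, key=lambda pair: pair[1])
        hits += best_triple in gold
        total += 1
    return hits / total if total else 0.0


# Hypothetical data: ambiguous English words yield several Korean candidates.
# The scores are made up for illustration, not produced by any model.
candidates = {
    ("bat", "IsA", "animal"): [
        (("박쥐", "IsA", "동물"), 0.91),    # "bat" as the flying mammal
        (("방망이", "IsA", "동물"), 0.12),  # "bat" as the club
    ],
    ("bank", "AtLocation", "city"): [
        (("은행", "AtLocation", "도시"), 0.40),  # financial institution
        (("둑", "AtLocation", "도시"), 0.60),    # river bank
    ],
}
gold = {("박쥐", "IsA", "동물"), ("은행", "AtLocation", "도시")}

result = precision_at_1(candidates, gold)
print(result)
```

In this toy setup the top candidate is valid for the first source triple but not for the second, so the metric is one hit out of two.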
