Semi-Distantly Supervised Neural Model for Generating Compact Answers to Open-Domain Why Questions

Ryo Ishida; Kentaro Torisawa; Jong-Hoon Oh; Ryu Iida; Canasai Kruengkrai; Julien Kloetzer

doi:10.1609/aaai.v32i1.12064

Authors

Ryo Ishida National Institute of Information and Communications Technology
Kentaro Torisawa National Institute of Information and Communications Technology
Jong-Hoon Oh National Institute of Information and Communications Technology
Ryu Iida National Institute of Information and Communications Technology
Canasai Kruengkrai National Institute of Information and Communications Technology
Julien Kloetzer National Institute of Information and Communications Technology

DOI:

https://doi.org/10.1609/aaai.v32i1.12064

Keywords:

Summarization, Deep Learning, Neural Networks, Query, Question, Distant Supervision

Abstract

This paper proposes a neural network-based method for generating compact answers to open-domain why-questions (e.g., "Why was Mr. Trump elected as the president of the US?"). Unlike factoid question answering methods that provide short text spans as answers, existing work on why-question answering has aimed at answering questions by retrieving relatively long text passages, each of which often consists of several sentences, from a text archive. While the actual answer to a why-question may be expressed over several consecutive sentences, these often contain redundant and/or unrelated parts. Such answers would not be suitable for spoken dialog systems and smart speakers such as Amazon Echo, which have received much attention recently. In this work, we aim to generate non-redundant compact answers to why-questions from answer passages retrieved from a very large web corpus (4 billion web pages) by an existing open-domain why-question answering system, using a novel neural network obtained by extending existing summarization methods. We also automatically generate training data using a large number of causal relations automatically extracted from 4 billion web pages by an existing supervised causality recognizer. This data is used to train our neural network, together with manually created training data. Through a series of experiments, we show that both our novel neural network and the auto-generated training data improve the quality of the generated answers, both in ROUGE score and in a subjective evaluation.
