From Generating to Mining: Automatically Scripting Conversations Using Existing Online Sources
Keywords: natural language generation, machine-generated content
Hearing people argue opposing sides of an issue can be a useful way to understand a topic; however, such debates or conversations often do not exist. Generating interesting natural language conversations is a difficult problem and typically requires a deep model of both a domain and its language. Fortunately, a huge amount of interesting text, written by both professional writers and amateurs, is already available on the web. In this paper, we describe a system that builds compelling conversations between two characters, not by generating wholly new natural language, but by gathering, assembling, and processing existing online textual content. Our initial system authors conversations between two simulated movie reviewers, in a style similar to “Siskel and Ebert.” Using various online repositories, the system searches for a variety of facts and opinions about a given film. It then uses this mined data to choose among conversational templates and construct the dialogue. In this way, the system is able to produce natural-sounding, colorful conversations and arguments without a deep representation of the subject being discussed.
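The template-selection step described above can be illustrated with a minimal sketch. This is not the system's actual implementation: the `Snippet` type, the sentiment scores, the template names, and the selection rule are all hypothetical, chosen only to show how mined opinions might drive the choice between an argument and an agreement.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    text: str     # sentence mined from an online review (hypothetical data)
    score: float  # assumed sentiment in [-1, 1]; how it is computed is out of scope


# Hypothetical templates: each turn names a speaker and the stance it needs.
TEMPLATES = {
    "argument": [("A", "pos"), ("B", "neg"), ("A", "pos")],
    "agreement_pos": [("A", "pos"), ("B", "pos")],
    "agreement_neg": [("A", "neg"), ("B", "neg")],
}


def choose_template(snippets):
    """Pick an argument if opinions are mixed, otherwise an agreement."""
    pos = sum(s.score > 0 for s in snippets)
    neg = sum(s.score < 0 for s in snippets)
    if pos and neg:
        return "argument"
    return "agreement_pos" if pos >= neg else "agreement_neg"


def build_dialogue(snippets):
    """Fill the chosen template with the strongest mined snippets first."""
    name = choose_template(snippets)
    pools = {
        "pos": sorted((s for s in snippets if s.score > 0), key=lambda s: -s.score),
        "neg": sorted((s for s in snippets if s.score < 0), key=lambda s: s.score),
    }
    lines = []
    for speaker, stance in TEMPLATES[name]:
        if pools[stance]:  # skip a turn when no matching snippet is left
            lines.append(f"{speaker}: {pools[stance].pop(0).text}")
    return name, lines
```

Under these assumptions, a film with mixed mined opinions yields an argument between the two reviewers, while uniformly positive or negative opinions yield an agreement. The dialogue text itself is taken verbatim from the mined snippets rather than generated.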