Hierarchical Methods for a Unified Approach to Discourse, Domain, and Style in Neural Conversational Models
With the advent of personal assistants such as Siri and Alexa, there has been a renewed focus on dialog systems, specifically open domain conversational agents. Dialog is a challenging problem since it spans multiple conversational turns. To further complicate the problem, there are many contextual cues and valid possible utterances. Dialog is fundamentally a multiscale process given that context is carried from previous utterances in the conversation; however, current neural methods lack the ability to carry human-like conversation. Neural dialog models are based on recurrent neural network Encoder-Decoder sequence-to-sequence models (Sutskever, Vinyals, and Le, 2014; Bahdanau, Cho, and Bengio, 2015). However, these models lack the ability to create temporal and stylistic coherence in conversations. We propose to incorporate dialog acts (such as Statement-non-opinion ["Me, I'm in the legal department."], Acknowledge ["Uh-huh."]) and discourse connectives (e.g. "because," "then"), utterance clustering and domain prediction, and style shifting using hierarchical methods. In particular, we show that clustering of utterance representations automatically allows for a unified hierarchical approach to discourse, domain, and style.