Unsupervised Summarization for Chat Logs with Topic-Oriented Ranking and Context-Aware Auto-Encoders

Authors

  • Yicheng Zou, Fudan University
  • Jun Lin, Alibaba Group
  • Lujun Zhao, Alibaba Group
  • Yangyang Kang, Alibaba Group
  • Zhuoren Jiang, Zhejiang University
  • Changlong Sun, Alibaba Group; Zhejiang University
  • Qi Zhang, Fudan University
  • Xuanjing Huang, Fudan University
  • Xiaozhong Liu, Indiana University Bloomington

Keywords:

Summarization, Applications

Abstract

Automatic chat summarization can help people quickly grasp important information from numerous chat messages. Unlike conventional documents, chat logs usually have fragmented and evolving topics. In addition, these logs contain many elliptical and interrogative sentences, which makes chat summarization highly context-dependent. In this work, we propose a novel unsupervised framework called RankAE to perform chat summarization without manually labeled data. RankAE consists of two components: a topic-oriented ranking strategy that selects topic utterances according to centrality and diversity simultaneously, and a denoising auto-encoder that is carefully designed to generate succinct but context-informative summaries based on the selected utterances. To evaluate the proposed method, we collect a large-scale dataset of chat logs from a customer service environment and build an annotated set used only for model evaluation. Experimental results show that RankAE significantly outperforms other unsupervised methods and is able to generate high-quality summaries in terms of relevance and topic coverage.
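
The abstract does not spell out how the ranking step combines centrality and diversity, so the following is only a minimal sketch of the general idea and not the authors' implementation. It assumes a greedy, Maximal-Marginal-Relevance-style selection over bag-of-words cosine similarities; the function names `bow`, `cosine`, and `select_utterances`, and the parameters `k` and `lam`, are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased, whitespace-tokenized bag-of-words vector (an assumption)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two Counter vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_utterances(utterances, k=2, lam=0.7):
    """Greedily pick up to k utterance indices, trading off centrality
    (mean similarity to the other utterances) against redundancy
    (maximum similarity to an already selected utterance).

    lam is a hypothetical trade-off weight: 1.0 ignores diversity,
    smaller values penalize redundant picks more strongly.
    """
    vecs = [bow(u) for u in utterances]
    n = len(vecs)
    centrality = [
        sum(cosine(vecs[i], vecs[j]) for j in range(n) if j != i) / max(n - 1, 1)
        for i in range(n)
    ]
    selected = []
    while len(selected) < min(k, n):
        best, best_score = None, -math.inf
        for i in range(n):
            if i in selected:
                continue
            redundancy = max(
                (cosine(vecs[i], vecs[j]) for j in selected), default=0.0
            )
            score = lam * centrality[i] - (1 - lam) * redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
    # Return indices in chat order so the selection reads chronologically.
    return sorted(selected)
```

Calling `select_utterances(chat, k=3)` on a list of utterance strings returns the chosen indices in chat order. In a pipeline like the one the abstract describes, the selected utterances would then be passed to the generation step; that step is not sketched here because the abstract gives no detail on the auto-encoder's architecture or noising scheme.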

Published

2021-05-18

How to Cite

Zou, Y., Lin, J., Zhao, L., Kang, Y., Jiang, Z., Sun, C., Zhang, Q., Huang, X., & Liu, X. (2021). Unsupervised Summarization for Chat Logs with Topic-Oriented Ranking and Context-Aware Auto-Encoders. Proceedings of the AAAI Conference on Artificial Intelligence, 35(16), 14674-14682. Retrieved from https://ojs.aaai.org/index.php/AAAI/article/view/17724

Section

AAAI Technical Track on Speech and Natural Language Processing III