Regression-Based Summarization of Email Conversations

Authors

  • Jan Ulrich, University of British Columbia
  • Giuseppe Carenini, University of British Columbia
  • Gabriel Murray, University of British Columbia
  • Raymond Ng, University of British Columbia

DOI:

https://doi.org/10.1609/icwsm.v3i1.13980

Keywords:

email, summarization, regression, machine learning, corpus

Abstract

In this paper we present a regression-based machine learning approach to email thread summarization. The regression model is able to take advantage of multiple gold-standard annotations for training purposes, in contrast to most work with binary classifiers. We also investigate the usefulness of novel features such as speech acts. This paper also introduces a newly created and publicly available email corpus for summarization research. We show that regression-based classifiers perform better than binary classifiers because they preserve more information about annotator judgements. In our comparison between different regression-based classifiers, we found that Bagging and Gaussian Processes have the highest weighted recall.
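
The contrast between regression targets and binary labels can be illustrated with a minimal sketch. It assumes, purely for illustration, that each annotator marks a set of sentences as summary-worthy and that a sentence's regression target is the fraction of annotators who selected it; the abstract does not specify the paper's actual scoring scheme, and the function names, the 0.5 threshold, and the toy annotations below are all hypothetical.

```python
def regression_targets(annotations, num_sentences):
    """Score each sentence by the fraction of annotators who selected it.

    annotations: list of sets of sentence indices, one set per annotator
    (an assumed input format, not taken from the paper).
    """
    n = len(annotations)
    return [sum(i in a for a in annotations) / n for i in range(num_sentences)]


def binary_labels(targets, threshold=0.5):
    """Collapse graded scores into 0/1 labels (assumed majority threshold)."""
    return [1 if t >= threshold else 0 for t in targets]


# Toy thread with 5 sentences and 3 annotators (made-up data).
annotations = [{0, 2}, {0, 3}, {0, 2, 4}]
targets = regression_targets(annotations, 5)
labels = binary_labels(targets)

print(targets)
print(labels)
```

In this toy case, sentences 3 and 4 were each chosen by one annotator and sentence 1 by none, yet all three receive the same binary label of 0. The graded targets keep that distinction, which is the kind of annotator information the abstract says a regression model can exploit.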

Published

2009-03-20

How to Cite

Ulrich, J., Carenini, G., Murray, G., & Ng, R. (2009). Regression-Based Summarization of Email Conversations. Proceedings of the International AAAI Conference on Web and Social Media, 3(1), 334-337. https://doi.org/10.1609/icwsm.v3i1.13980