End-to-End Deep Reinforcement Learning for Conversation Disentanglement


  • Karan Bhukar, IBM Research
  • Harshit Kumar, IBM Research
  • Dinesh Raghu, IBM Research
  • Ajay Gupta, Meta




Keywords: SNLP: Conversational AI/Dialogue Systems, ML: Reinforcement Learning Algorithms


Collaborative communication platforms (e.g., Slack) support multi-party conversations in which a large number of messages are posted on shared channels. Multiple conversations intermingle within these messages. The task of conversation disentanglement is to cluster these intermingled messages into conversations. Existing approaches are trained using loss functions that optimize only local decisions, i.e., predicting a reply-to link for each message and thereby creating clusters of conversations. In this work, we propose an end-to-end reinforcement learning (RL) approach that directly optimizes a global metric. We observe that using existing global metrics such as variation of information and adjusted Rand index as a reward for the RL agent degrades its performance. This happens because these metrics completely ignore the reply-to links between messages (local decisions) during reward computation. Therefore, we propose a novel thread-level reward function that captures the global metric without ignoring the local decisions. Through experiments on the Ubuntu IRC dataset, we demonstrate that the proposed RL model improves performance on both link-level and conversation-level metrics.
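The observation that clustering metrics ignore reply-to links can be illustrated with a small sketch. This is not the authors' implementation or their thread-level reward; it only shows, under assumed conventions, how reply-to links induce conversation clusters and why variation of information cannot tell apart two link structures that yield the same clusters. The function names, the toy data, and the convention that a message linking to itself starts a new conversation are all assumptions made for illustration.

```python
import math
from collections import Counter


def clusters_from_links(parents):
    """Assign a conversation id to each message by following reply-to links.

    Assumed convention: parents[i] is the index of the message that
    message i replies to, parents[i] == i starts a new conversation,
    and a message only replies to itself or to an earlier message.
    """
    labels = []
    for i, p in enumerate(parents):
        labels.append(i if p == i else labels[p])
    return labels


def variation_of_information(a, b):
    """VI(a, b) = H(a | b) + H(b | a), in nats; 0 means identical clusterings."""
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    joint = Counter(zip(a, b))
    vi = 0.0
    for (x, y), n_xy in joint.items():
        p_xy = n_xy / n
        vi -= p_xy * (math.log(p_xy / (count_a[x] / n))
                      + math.log(p_xy / (count_b[y] / n)))
    return vi


def link_accuracy(gold, pred):
    """Fraction of messages whose predicted parent matches the gold parent."""
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


# Toy example (hypothetical data): six messages, two conversations.
gold_links = [0, 0, 1, 3, 3, 2]   # a chain 0 <- 1 <- 2 <- 5, and 3 <- 4
pred_links = [0, 0, 0, 3, 3, 0]   # same clusters, but a star around message 0

gold_clusters = clusters_from_links(gold_links)
pred_clusters = clusters_from_links(pred_links)

vi = variation_of_information(gold_clusters, pred_clusters)
acc = link_accuracy(gold_links, pred_links)
print(f"VI = {vi:.3f}, link accuracy = {acc:.3f}")
```

In this toy case the predicted links differ from the gold links for two of the six messages, yet both link structures produce the same two clusters, so a reward based only on the clustering would treat the prediction as perfect.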




How to Cite

Bhukar, K., Kumar, H., Raghu, D., & Gupta, A. (2023). End-to-End Deep Reinforcement Learning for Conversation Disentanglement. Proceedings of the AAAI Conference on Artificial Intelligence, 37(11), 12571-12579. https://doi.org/10.1609/aaai.v37i11.26480



AAAI Technical Track on Speech & Natural Language Processing