Recovering Implicit Thread Structure in Newsgroup Style Conversations
Online discussions are composed of multiple interwoven threads, whether or not that threaded structure is made explicit in the representation and presentation of the conversational data. Recovering the thread structure is valuable because it makes it possible to isolate discussion related to specific subtopics or to particular conversational goals. In prior work, thread structure has been reconstructed from explicit metadata features such as posted-by and reply-to relationships. The contribution of this paper is a novel approach to recovering thread structure in discussion forums where this explicit metadata is missing. The approach uses a graph-based representation of a collection of messages in which connections between messages are postulated based on inter-message similarity. We evaluate three variations of this simple baseline approach, each of which exploits the temporal relationships between messages in a different way. The results show that all three proposed variations outperform the simple threshold-cut baseline.
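To make the threshold-cut idea concrete, the following is a minimal sketch of one way such a baseline could be realized; it is not the implementation evaluated in this paper. It assumes bag-of-words cosine similarity as the inter-message similarity measure, an arbitrary threshold of 0.3, and connected components of the thresholded graph as the recovered threads. All function names (bag_of_words, cosine, threshold_cut_threads) are hypothetical, and the three temporal variations are not reproduced because the abstract does not specify them.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts (an assumed, simplistic message representation)."""
    return Counter(re.findall(r"[a-z0-9']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def threshold_cut_threads(messages, threshold=0.3):
    """Link messages whose similarity meets the threshold (an assumed value),
    then return the connected components as lists of message indices."""
    vectors = [bag_of_words(m) for m in messages]
    parent = list(range(len(messages)))  # union-find forest over messages

    def find(i):
        # Follow parent pointers to the root, halving the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(messages)):
        for j in range(i + 1, len(messages)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)  # merge the two components

    threads = {}
    for i in range(len(messages)):
        threads.setdefault(find(i), []).append(i)
    return sorted(threads.values())


messages = [
    "how do I install the printer driver on linux",
    "the printer driver for linux is in the package repository",
    "best recipe for chocolate cake",
    "my chocolate cake recipe uses dark chocolate",
]
threads = threshold_cut_threads(messages)
```

In this toy example the two printer messages and the two cake messages end up in separate components. The sketch compares every pair of messages regardless of when they were posted, which is exactly the information the temporal variations are described as exploiting.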