Latent Set Models for Two-Mode Network Data

Authors

  • Christopher DuBois University of California, Irvine
  • James Foulds University of California, Irvine
  • Padhraic Smyth University of California, Irvine

DOI:

https://doi.org/10.1609/icwsm.v5i1.14131

Abstract

Two-mode networks are a natural representation for many kinds of relational data. These networks are bipartite graphs consisting of two distinct sets ("modes") of entities. For example, one can model multiple-recipient email data as a two-mode network of (a) individuals and (b) the emails that they send or receive. In this work we present a statistical model for two-mode network data which posits that individuals belong to latent sets and that the members of a particular set tend to co-appear. We show how to infer these latent sets from observed data using a Markov chain Monte Carlo inference algorithm. We apply the model to the Enron email corpus, using it to discover interpretable latent structure and evaluating its predictive accuracy on a missing-data task. We also discuss extensions to the model that incorporate additional side information, such as the email's sender or text content, further improving its accuracy.
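
To make the two-mode representation concrete, the following minimal sketch builds a binary incidence matrix (emails by individuals) from toy data and counts how often each pair of individuals co-appears. The names, the toy emails, and the variables `incidence` and `co_appear` are illustrative assumptions, not taken from the paper, and the sketch covers only the data representation, not the latent set model or its MCMC inference.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each email is the set of people who appear on it
# (sender and recipients together). Not drawn from the Enron corpus.
emails = [
    {"alice", "bob", "carol"},
    {"alice", "bob"},
    {"dave", "erin"},
    {"carol", "dave", "erin"},
]

# The first mode is the emails; the second mode is the individuals.
people = sorted(set().union(*emails))

# Binary incidence matrix: one row per email, one column per individual.
# incidence[i][j] == 1 if person j appears on email i.
incidence = [[int(p in email) for p in people] for email in emails]

# Co-appearance counts: number of emails on which each pair appears together.
co_appear = Counter()
for email in emails:
    for a, b in combinations(sorted(email), 2):
        co_appear[(a, b)] += 1

for pair, count in sorted(co_appear.items()):
    print(pair, count)
```

Pairs with high co-appearance counts, such as the hypothetical ("alice", "bob") here, are the kind of regularity that a model based on latent sets is intended to explain.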

Published

2021-08-03

How to Cite

DuBois, C., Foulds, J., & Smyth, P. (2021). Latent Set Models for Two-Mode Network Data. Proceedings of the International AAAI Conference on Web and Social Media, 5(1), 137-144. https://doi.org/10.1609/icwsm.v5i1.14131