Bayesian Matrix Factorization with Side Information and Dirichlet Process Mixtures

Authors

  • Ian Porteous, University of California, Irvine
  • Arthur Asuncion, University of California, Irvine
  • Max Welling, University of California, Irvine

DOI:

https://doi.org/10.1609/aaai.v24i1.7686

Keywords:

Matrix Factorization, Bayesian, Collaborative Filtering

Abstract

Matrix factorization is a fundamental technique in machine learning that is applicable to collaborative filtering, information retrieval, and many other areas. In collaborative filtering and similar tasks, the objective is to fill in missing elements of a sparse data matrix. One of the biggest challenges in this setting is filling in a column or row of the matrix that has very few observations. In this paper we introduce a Bayesian matrix factorization model that performs regression against side information known about the data in addition to the observations. The side information helps by adding observed entries to the factored matrices. We also introduce a nonparametric mixture model for the prior of the rows and columns of the factored matrices that gives a different regularization for each latent class. Besides providing a richer prior, the posterior distribution of mixture assignments reveals the latent classes. Using Gibbs sampling for inference, we apply our model to the Netflix Prize problem of predicting movie ratings given an incomplete user-movie ratings matrix. By combining rating information with gathered metadata, our Bayesian approach outperforms other matrix factorization techniques even when using fewer dimensions.
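
The statement that side information adds observed entries to the factored matrices can be illustrated with a small sketch. It is not the authors' implementation, and it omits the Dirichlet process prior and the Gibbs sampler entirely. It assumes a predicted rating of the form u_i·v_j + a_i·x_j + y_i·b_j, where x_j (item features) and y_i (user features) are observed, and u_i, v_j, a_i, b_j are latent; all of these names and the numbers below are hypothetical. Under that assumption, the regression terms are equivalent to one inner product between rows that have been extended with the observed features.

```python
def dot(a, b):
    """Inner product of two equal-length lists of floats."""
    return sum(x * y for x, y in zip(a, b))


def predict_separate(u_i, v_j, a_i, x_j, b_j, y_i):
    """Latent-factor term plus two regression terms on side information.

    u_i, v_j : latent user / item factors (hypothetical names)
    a_i      : latent user weights on observed item features x_j
    b_j      : latent item weights on observed user features y_i
    """
    return dot(u_i, v_j) + dot(a_i, x_j) + dot(y_i, b_j)


def predict_augmented(u_i, v_j, a_i, x_j, b_j, y_i):
    """Same prediction, written as a single inner product.

    The user row is extended with its observed features y_i, and the
    item row with its observed features x_j, so part of each factored
    matrix is observed rather than latent.
    """
    user_row = u_i + a_i + y_i  # [latent | latent weights | observed]
    item_row = v_j + x_j + b_j  # [latent | observed | latent weights]
    return dot(user_row, item_row)


# Made-up numbers, purely for illustration.
u_i, v_j = [0.5, -1.0], [2.0, 0.5]
a_i, x_j = [0.3], [1.0]
b_j, y_i = [0.2, -0.1], [1.0, 0.0]

r_separate = predict_separate(u_i, v_j, a_i, x_j, b_j, y_i)
r_augmented = predict_augmented(u_i, v_j, a_i, x_j, b_j, y_i)
print(r_separate, r_augmented)
```

Both functions return the same value for any inputs of matching lengths, which is why a row or column with very few ratings can still receive an informative prediction: its observed features contribute through the regression weights even when its latent factors are poorly determined.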

Published

2010-07-03

How to Cite

Porteous, I., Asuncion, A., & Welling, M. (2010). Bayesian Matrix Factorization with Side Information and Dirichlet Process Mixtures. Proceedings of the AAAI Conference on Artificial Intelligence, 24(1), 563-568. https://doi.org/10.1609/aaai.v24i1.7686