Multimodal Poisson Gamma Belief Network

Authors

  • Chaojie Wang, Xidian University
  • Bo Chen, Xidian University
  • Mingyuan Zhou, The University of Texas at Austin

DOI:

https://doi.org/10.1609/aaai.v32i1.11846

Keywords:

Unsupervised Learning, Graphical Model Learning, Language and Vision

Abstract

To learn a deep generative model of multimodal data, we propose a multimodal Poisson gamma belief network (mPGBN) that tightly couples the data of different modalities at multiple hidden layers. The mPGBN unsupervisedly extracts a nonnegative latent representation using an upward-downward Gibbs sampler. It imposes sparse connections between different layers, making it simple to visualize the generative process and the relationships between the latent features of different modalities. Our experimental results on bi-modal data consisting of images and tags show that the mPGBN can easily impute a missing modality and hence is useful for both image annotation and retrieval. We further demonstrate that the mPGBN achieves state-of-the-art results on unsupervised latent feature extraction from multimodal data.
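To make the generative structure sketched in the abstract concrete, the following is a minimal, hypothetical NumPy sketch (not the authors' released code), assuming a two-layer network in which the two modalities share top-layer gamma weights through modality-specific loading matrices; all dimensions, variable names, and hyperparameters here are illustrative.

  import numpy as np

  rng = np.random.default_rng(0)

  # Layer widths, top to bottom: K2 shared top-layer units feed two
  # modality-specific bottom layers (e.g., image features and tags).
  K2, K1_img, K1_tag = 20, 50, 30
  V_img, V_tag = 784, 1000   # observed dimensions per modality (hypothetical)

  # Nonnegative loading matrices with Dirichlet-distributed columns.
  Phi2_img = rng.dirichlet(np.ones(K1_img), size=K2).T   # (K1_img, K2)
  Phi2_tag = rng.dirichlet(np.ones(K1_tag), size=K2).T   # (K1_tag, K2)
  Phi1_img = rng.dirichlet(np.ones(V_img), size=K1_img).T
  Phi1_tag = rng.dirichlet(np.ones(V_tag), size=K1_tag).T

  # Top-layer gamma weights are shared by both modalities, which is what
  # couples them at the hidden layers.
  r, c = np.ones(K2), 1.0
  theta2 = rng.gamma(r, 1.0 / c)                         # (K2,)

  # Modality-specific middle layers: gamma shape parameters are the
  # projections of the shared top-layer weights.
  theta1_img = rng.gamma(Phi2_img @ theta2, 1.0 / c)
  theta1_tag = rng.gamma(Phi2_tag @ theta2, 1.0 / c)

  # Observed counts for each modality are Poisson given the bottom layer.
  x_img = rng.poisson(Phi1_img @ theta1_img)
  x_tag = rng.poisson(Phi1_tag @ theta1_tag)

Because the gamma-Poisson links are conjugate under data augmentation, posterior inference for such a model can proceed layer by layer, which is what the upward-downward Gibbs sampler in the paper exploits; imputing a missing modality then amounts to sampling its chain given the shared latent weights inferred from the observed one.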

Published

2018-04-26

How to Cite

Wang, C., Chen, B., & Zhou, M. (2018). Multimodal Poisson Gamma Belief Network. Proceedings of the AAAI Conference on Artificial Intelligence, 32(1). https://doi.org/10.1609/aaai.v32i1.11846

Issue

Vol. 32 No. 1 (2018)

Section

Main Track: Machine Learning Applications