Sparse Group Restricted Boltzmann Machines

Authors

  • Heng Luo, Shanghai Jiao Tong University
  • Ruimin Shen, Shanghai Jiao Tong University
  • Changyong Niu, Zhengzhou University
  • Carsten Ullrich, Shanghai Jiao Tong University

DOI:

https://doi.org/10.1609/aaai.v25i1.7923

Abstract

Since learning in Boltzmann machines is typically quite slow, there is a need to restrict connections within hidden layers. However, the resulting states of hidden units exhibit statistical dependencies. Based on this observation, we propose using l1/l2 regularization upon the activation probabilities of hidden units in restricted Boltzmann machines to capture the local dependencies among hidden units. This regularization not only encourages hidden units of many groups to be inactive given observed data but also makes hidden units within a group compete with each other for modeling observed data. Thus, the l1/l2 regularization on RBMs yields sparsity at both the group and the hidden-unit levels. We call RBMs trained with this regularizer sparse group RBMs (SGRBMs). The proposed SGRBMs are applied to model patches of natural images, handwritten digits and OCR English letters. Then, to demonstrate that SGRBMs can learn more discriminative features, we apply SGRBMs to pretrain deep networks for classification tasks. Furthermore, we illustrate that the regularizer can also be applied to deep Boltzmann machines, which leads to sparse group deep Boltzmann machines. When adapted to the MNIST data set, a two-layer sparse group Boltzmann machine achieves an error rate of 0.84%, which is, to our knowledge, the best published result on the permutation-invariant version of the MNIST task.
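As a rough illustration of the idea (a minimal sketch, not the authors' code), the penalty can be read as an l1 norm across groups of the l2 norms of each group's hidden activation probabilities. The partition into equal-sized, non-overlapping groups, the layer sizes, and the function names below are assumptions for the example; during training, the gradient of this penalty, scaled by a small coefficient, would be added to the usual contrastive-divergence update.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hidden_probs(v, W, b):
    """Activation probabilities p(h_j = 1 | v) of a binary RBM."""
    return sigmoid(v @ W + b)

def l1_l2_penalty(probs, group_size):
    """l1 norm over groups of the l2 norm of each group's activation probabilities.

    Encourages most groups to be inactive (group-level sparsity) while units
    inside an active group compete with each other (unit-level sparsity).
    """
    groups = probs.reshape(-1, group_size)           # (n_groups, group_size)
    return np.sqrt((groups ** 2).sum(axis=1)).sum()  # sum of per-group l2 norms

# Toy example; all sizes are illustrative only.
rng = np.random.default_rng(0)
n_visible, n_hidden, group_size = 784, 500, 10
W = 0.01 * rng.standard_normal((n_visible, n_hidden))
b = np.zeros(n_hidden)
v = rng.integers(0, 2, size=n_visible).astype(float)

penalty = l1_l2_penalty(hidden_probs(v, W, b), group_size)
print(penalty)
```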

Published

2011-08-04

How to Cite

Luo, H., Shen, R., Niu, C., & Ullrich, C. (2011). Sparse Group Restricted Boltzmann Machines. Proceedings of the AAAI Conference on Artificial Intelligence, 25(1), 429-434. https://doi.org/10.1609/aaai.v25i1.7923

Section

AAAI Technical Track: Machine Learning