When and Why Are Deep Networks Better Than Shallow Ones?

Authors

  • Hrushikesh Mhaskar, California Institute of Technology
  • Qianli Liao, Massachusetts Institute of Technology
  • Tomaso Poggio, Massachusetts Institute of Technology

DOI:

https://doi.org/10.1609/aaai.v31i1.10913

Keywords:

deep learning, shallow and deep networks, function approximation

Abstract

While the universal approximation property holds both for hierarchical and shallow networks, deep networks can approximate the class of compositional functions as well as shallow networks but with an exponentially smaller number of training parameters and lower sample complexity. Compositional functions are obtained as a hierarchy of local constituent functions, where "local functions" are functions of a small number of variables. This theorem establishes an old conjecture by Bengio on the role of depth in networks, characterizing precisely the conditions under which it holds. It also suggests possible answers to the puzzle of why high-dimensional deep networks trained on large training sets often do not seem to overfit.
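To make the notion of a compositional function concrete, the following is a minimal Python sketch of a binary-tree composition of the kind discussed in the paper, where every constituent function is bivariate; the specific choice of h below is hypothetical and purely illustrative, not taken from the paper.

```python
import numpy as np

# Hypothetical local constituent function: each node of the hierarchy
# depends on only two variables ("low dimensionality").
def h(a, b):
    return np.tanh(a + 2.0 * b)

def compositional_f(x):
    """Binary-tree compositional function of n = 8 variables:
    f(x1,...,x8) = h3(h21(h11(x1,x2), h12(x3,x4)),
                      h22(h13(x5,x6), h14(x7,x8)))
    Every constituent function is 2-dimensional, so a deep network
    whose graph mirrors this tree only ever has to approximate
    bivariate functions, while a shallow network must treat f as a
    generic 8-dimensional function.
    """
    x1, x2, x3, x4, x5, x6, x7, x8 = x
    level1 = [h(x1, x2), h(x3, x4), h(x5, x6), h(x7, x8)]        # 4 local nodes
    level2 = [h(level1[0], level1[1]), h(level1[2], level1[3])]  # 2 local nodes
    return h(level2[0], level2[1])                               # root node

print(compositional_f(np.linspace(-1.0, 1.0, 8)))
```

The tree over n inputs has n - 1 constituent nodes, each of bounded input dimension; a deep network matching this graph pays an approximation cost per node that depends only on that bounded dimension, which is the structural source of the exponential gap in parameters claimed in the abstract.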

Published

2017-02-13

How to Cite

Mhaskar, H., Liao, Q., & Poggio, T. (2017). When and Why Are Deep Networks Better Than Shallow Ones? Proceedings of the AAAI Conference on Artificial Intelligence, 31(1). https://doi.org/10.1609/aaai.v31i1.10913