Simplicity Bias in Overparameterized Machine Learning
DOI:
https://doi.org/10.1609/aaai.v38i10.28981
Keywords:
ML: Learning Theory, ML: Probabilistic Circuits and Graphical Models, ML: Deep Learning Theory
Abstract
A thorough theoretical understanding of the surprising generalization ability of deep networks (and other overparameterized models) is still lacking. Here we demonstrate that simplicity bias is a major phenomenon to be reckoned with in overparameterized machine learning. In addition to explaining the outcome of simplicity bias, we also study its source. Through concrete, rigorous examples, we argue that (i) simplicity bias can explain generalization in overparameterized learning models such as neural networks; (ii) simplicity bias and excellent generalization are optimizer-independent, as our example shows; although the optimizer affects training, it is not the driving force behind simplicity bias; (iii) simplicity bias in pre-training models, and in subsequent posteriors, is universal and stems from the subtle fact that uniformly-at-random constructed priors are not uniformly-at-random sampled; and (iv) in neural network models, the biasing mechanism in wide (and shallow) networks differs from the biasing mechanism in deep (and narrow) networks.
Published
2024-03-24
How to Cite
Berchenko, Y. (2024). Simplicity Bias in Overparameterized Machine Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 38(10), 11052-11060. https://doi.org/10.1609/aaai.v38i10.28981
Issue
Section
AAAI Technical Track on Machine Learning I