Simplicity Bias in Overparameterized Machine Learning

Authors

  • Yakir Berchenko, Ben-Gurion University of the Negev, Department of Industrial Engineering and Management

DOI:

https://doi.org/10.1609/aaai.v38i10.28981

Keywords:

ML: Learning Theory, ML: Probabilistic Circuits and Graphical Models, ML: Deep Learning Theory

Abstract

A thorough theoretical understanding of the surprising generalization ability of deep networks (and other overparameterized models) is still lacking. Here we demonstrate that simplicity bias is a major phenomenon to be reckoned with in overparameterized machine learning. In addition to explaining the outcome of simplicity bias, we study its source: through concrete, rigorous examples, we argue that (i) simplicity bias can explain generalization in overparameterized learning models such as neural networks; (ii) simplicity bias and excellent generalization are optimizer-independent, as our example shows: although the optimizer affects training, it is not the driving force behind simplicity bias; (iii) simplicity bias in pre-training models, and in the subsequent posteriors, is universal and stems from the subtle fact that priors constructed uniformly at random are not sampled uniformly at random; and (iv) in neural network models, the biasing mechanism in wide (and shallow) networks differs from the biasing mechanism in deep (and narrow) networks.
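To make the distinction in point (iii) concrete, the following is a minimal sketch (not the paper's construction): drawing the *parameters* of a small network uniformly at random does not sample the induced *functions* uniformly. The one-hidden-layer ReLU architecture, the U(-1, 1) weight distribution, and the sample count below are all illustrative assumptions.

```python
import numpy as np
from collections import Counter
from itertools import product

rng = np.random.default_rng(0)
n_inputs = 3
inputs = np.array(list(product([0, 1], repeat=n_inputs)))  # all 2**3 = 8 Boolean inputs

def random_function(width=8):
    """Draw the parameters of a one-hidden-layer ReLU network uniformly
    at random and return the Boolean function (truth table) it computes.
    (Illustrative architecture, not the paper's construction.)"""
    W1 = rng.uniform(-1, 1, (n_inputs, width))
    b1 = rng.uniform(-1, 1, width)
    W2 = rng.uniform(-1, 1, width)
    h = np.maximum(inputs @ W1 + b1, 0)   # ReLU hidden layer
    out = (h @ W2 > 0).astype(int)        # thresholded output on all 8 inputs
    return tuple(out)

counts = Counter(random_function() for _ in range(100_000))

# A uniform prior over all 2**(2**3) = 256 truth tables would give each
# function roughly 390 of the 100,000 draws; instead a handful of simple
# functions (typically the constant ones) take most of the probability mass.
for fn, c in counts.most_common(5):
    print(fn, c)
```

The mechanism at work in this toy experiment is that many different parameter settings collapse onto the same simple function, so uniformity in parameter space induces a heavily skewed, simplicity-biased distribution in function space.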

Published

2024-03-24

How to Cite

Berchenko, Y. (2024). Simplicity Bias in Overparameterized Machine Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 38(10), 11052-11060. https://doi.org/10.1609/aaai.v38i10.28981

Issue

Vol. 38 No. 10

Section

AAAI Technical Track on Machine Learning I