Simplicity Bias in Overparameterized Machine Learning
DOI:
https://doi.org/10.1609/aaai.v38i10.28981
Keywords:
ML: Learning Theory, ML: Probabilistic Circuits and Graphical Models, ML: Deep Learning Theory
Abstract
A thorough theoretical understanding of the surprising generalization ability of deep networks (and other overparameterized models) is still lacking. Here we demonstrate that simplicity bias is a major phenomenon to be reckoned with in overparameterized machine learning. In addition to explaining the outcome of simplicity bias, we also study its source. Through concrete, rigorous examples, we argue that (i) simplicity bias can explain generalization in overparameterized learning models such as neural networks; (ii) simplicity bias and excellent generalization are optimizer-independent, as our example shows; although the optimizer affects training, it is not the driving force behind simplicity bias; (iii) simplicity bias in pre-training models, and in subsequent posteriors, is universal and stems from the subtle fact that uniformly-at-random constructed priors are not uniformly-at-random sampled; and (iv) in neural network models, the biasing mechanism in wide (and shallow) networks differs from the biasing mechanism in deep (and narrow) networks.
Published
2024-03-24
How to Cite
Berchenko, Y. (2024). Simplicity Bias in Overparameterized Machine Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 38(10), 11052-11060. https://doi.org/10.1609/aaai.v38i10.28981
Issue
Section
AAAI Technical Track on Machine Learning I