Simplicity Bias in Overparameterized Machine Learning


  • Yakir Berchenko, Ben-Gurion University of the Negev, Department of Industrial Engineering and Management



ML: Learning Theory, ML: Probabilistic Circuits and Graphical Models, ML: Deep Learning Theory


A thorough theoretical understanding of the surprising generalization ability of deep networks (and other overparameterized models) is still lacking. Here we demonstrate that simplicity bias is a major phenomenon to be reckoned with in overparameterized machine learning. In addition to explaining the outcome of simplicity bias, we also study its source. Following concrete, rigorous examples, we argue that (i) simplicity bias can explain generalization in overparameterized learning models such as neural networks; (ii) simplicity bias and excellent generalization are optimizer-independent, as our example shows: although the optimizer affects training, it is not the driving force behind simplicity bias; (iii) simplicity bias in pre-training models, and in the subsequent posteriors, is universal and stems from the subtle fact that uniformly-at-random constructed priors are not uniformly-at-random sampled; and (iv) in neural network models, the biasing mechanism in wide (and shallow) networks is different from the biasing mechanism in deep (and narrow) networks.
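Point (iii) can be made concrete with a toy computation. The sketch below is not the paper's construction; it assumes a single threshold unit on two binary inputs with weights and bias drawn from the hypothetical grid {-1, 0, 1}. It enumerates all 27 parameter settings and counts how many of them realize each Boolean function, to show that a uniform choice of parameters need not induce a uniform distribution over functions.

```python
from collections import Counter
from itertools import product

INPUTS = list(product((0, 1), repeat=2))   # the four points of {0,1}^2
WEIGHT_VALUES = (-1, 0, 1)                 # assumed toy parameter grid


def count_functions():
    """Map every (w1, w2, b) setting to the Boolean function it computes."""
    counts = Counter()
    for w1, w2, b in product(WEIGHT_VALUES, repeat=3):
        # Truth table of the threshold unit, ordered as in INPUTS.
        table = tuple(int(w1 * x1 + w2 * x2 + b > 0) for x1, x2 in INPUTS)
        counts[table] += 1
    return counts


if __name__ == "__main__":
    counts = count_functions()
    total = sum(counts.values())
    # Print each realized truth table with its count and induced probability.
    for table, n in counts.most_common():
        print(table, n, f"{n / total:.3f}")
```

Under these assumptions, 12 of the 27 settings compute the constant-zero function and 4 compute the constant-one function, while every other realized function corresponds to exactly one setting. Only 13 of the 16 possible Boolean functions are reached at all. The induced distribution over functions is therefore heavily skewed toward the simplest ones, even though the parameters were chosen uniformly.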



How to Cite

Berchenko, Y. (2024). Simplicity Bias in Overparameterized Machine Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 38(10), 11052-11060.



AAAI Technical Track on Machine Learning I