All-Purpose Mean Estimation over R

Authors

  • Jasper C.H. Lee, University of California, Davis

DOI:

https://doi.org/10.1609/aaai.v40i47.41348

Abstract

Given society's increasing reliance on data, the collection and processing of data into useful information is a technical problem of growing focus and, perhaps paradoxically, a critical bottleneck in many data science and machine learning applications. Yet even for the most basic statistical problems, such as mean estimation, there is a theory-practice divide. Conventional methods like the sample mean, while supported by theoretical results under strong assumptions, are often brittle in the presence of extreme data. Practitioners therefore often resort to ad-hoc and unprincipled "outlier removal" heuristics, which can lead to wrong conclusions (e.g. Millikan's underestimation of the electron charge). In this talk, I will describe my work that essentially resolves the fundamental 1-dimensional mean estimation problem. I will show the construction of a statistically optimal and computationally efficient 1-dimensional mean estimator whose estimation error is optimal even in the leading multiplicative constant, under bare-minimum distributional assumptions (FOCS 2021). Furthermore, I will discuss its various robustness properties (ICML 2025 Oral), in particular its robustness to adversarial sample corruption.

Published

2026-03-14

How to Cite

Lee, J. C. (2026). All-Purpose Mean Estimation over R. Proceedings of the AAAI Conference on Artificial Intelligence, 40(47), 39823–39824. https://doi.org/10.1609/aaai.v40i47.41348