All-Purpose Mean Estimation over R

Jasper C.H. Lee

doi:10.1609/aaai.v40i47.41348

Authors

Jasper C.H. Lee, University of California, Davis

DOI:

https://doi.org/10.1609/aaai.v40i47.41348

Abstract

Given society's increasing reliance on data, its collection and processing into useful information is a technical problem of growing focus and, perhaps paradoxically, a critical bottleneck in many data science and machine learning applications. Yet even for the most basic statistical problems, such as mean estimation, there is a theory-practice divide. Conventional methods like the sample mean, while supported by theoretical results under strong assumptions, are often brittle in the presence of extreme data. Practitioners thus often resort to ad hoc, unprincipled "outlier removal" heuristics, which can lead to wrong conclusions (e.g., Millikan's underestimation of the electron charge). In this talk, I will describe my work that essentially resolves the fundamental one-dimensional mean estimation problem. I will show the construction of a statistically optimal and computationally efficient one-dimensional mean estimator whose estimation error is optimal even in the leading multiplicative constant, under bare-minimum distributional assumptions (FOCS 2021). Furthermore, I will discuss its various robustness properties (ICML 2025 Oral), in particular its robustness to adversarial sample corruption.
