Estimating the True Distribution of Data Collected with Randomized Response

Carlos Antonio Pinzón; Ehab ElSalamouny; Lucas Massot; Alexis Miller; Héber Hwang Arcolezi; Catuscia Palamidessi

doi:10.1609/aaai.v40i42.40888

Authors

Carlos Antonio Pinzón, INRIA
Ehab ElSalamouny, INRIA
Lucas Massot, École Polytechnique
Alexis Miller, École Normale Supérieure de Lyon
Héber Hwang Arcolezi, INRIA
Catuscia Palamidessi, INRIA

DOI:

https://doi.org/10.1609/aaai.v40i42.40888

Abstract

Randomized Response (RR) is a protocol for collecting and analyzing categorical data with local differential privacy guarantees. It has been used as a building block of mechanisms deployed by big tech companies to collect data from app and web users. Each user reports an automatically randomized alteration of their true value to the analytics server, which then estimates the histogram of the true, unseen values of all users by applying a debiasing rule that compensates for the added randomness. A known issue is that the standard debiasing rule can yield a vector with negative entries, which cannot be interpreted as a histogram, and there is no consensus on the best fix. An elegant but slow solution is the Iterative Bayesian Update (IBU) algorithm, which converges to the Maximum Likelihood Estimate (MLE) as the number of iterations goes to infinity. This paper bypasses IBU by providing a simple formula for the exact MLE of RR, and it compares this formula with other estimation methods experimentally to help practitioners decide which one to use.
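
To make the pipeline described in the abstract concrete, the following is a minimal sketch and not the authors' code. It assumes the common k-ary form of RR, in which a user reports the true value with probability p = e^ε / (e^ε + k − 1) and each other value with probability q = 1 / (e^ε + k − 1); the abstract does not state which variant the paper uses. It shows the standard debiasing rule producing a negative entry and a textbook IBU loop that stays non-negative. The paper's closed-form MLE is not given in the abstract and is deliberately not reproduced here. All function names and the example frequencies are hypothetical.

```python
import math
import random


def krr_probs(k, epsilon):
    """Return (p, q) for k-ary RR: p = P(report true value),
    q = P(report one specific other value). Assumed variant, see lead-in."""
    denom = math.exp(epsilon) + k - 1
    return math.exp(epsilon) / denom, 1.0 / denom


def krr_report(value, k, epsilon, rng):
    """Randomize one value in range(k) on the user's side."""
    p, _ = krr_probs(k, epsilon)
    if rng.random() < p:
        return value
    other = rng.randrange(k - 1)  # uniform over the k - 1 other values
    return other if other < value else other + 1


def debias(freqs, epsilon):
    """Standard debiasing rule: invert the expected effect of the noise.
    The result sums to 1 but individual entries can be negative."""
    k = len(freqs)
    p, q = krr_probs(k, epsilon)
    return [(f - q) / (p - q) for f in freqs]


def ibu(freqs, epsilon, iterations=1000):
    """Iterative Bayesian Update, started from the uniform distribution."""
    k = len(freqs)
    p, q = krr_probs(k, epsilon)
    theta = [1.0 / k] * k
    for _ in range(iterations):
        total = sum(theta)
        # Probability of each report y under the current estimate theta.
        out = [q * total + (p - q) * theta[y] for y in range(k)]
        theta = [
            theta[x] * sum(
                freqs[y] * (p if x == y else q) / out[y] for y in range(k)
            )
            for x in range(k)
        ]
    return theta


# Hypothetical observed (noisy) frequencies for k = 3 categories.
freqs = [0.70, 0.25, 0.05]
epsilon = 1.0

naive = debias(freqs, epsilon)  # contains a negative entry for this input
bayes = ibu(freqs, epsilon)     # non-negative, sums to 1

print("debiased:", naive)
print("IBU:     ", bayes)
```

In this sketch the rare third category is observed less often than the noise alone would predict (0.05 < q), so the debiasing rule pushes its estimate below zero, while each IBU step multiplies a non-negative estimate by a non-negative factor and therefore never leaves the probability simplex.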

Published

2026-03-14

How to Cite

Pinzón, C. A., ElSalamouny, E., Massot, L., Miller, A., Hwang Arcolezi, H., & Palamidessi, C. (2026). Estimating the True Distribution of Data Collected with Randomized Response. Proceedings of the AAAI Conference on Artificial Intelligence, 40(42), 35751–35758. https://doi.org/10.1609/aaai.v40i42.40888

Issue

Vol. 40 No. 42: AAAI-26 Technical Tracks 42

Section

AAAI Technical Track on Philosophy and Ethics of AI
