Optimal Schemes for Discrete Distribution Estimation under Locally Differential Privacy

Abstract

We consider the minimax estimation problem of a discrete distribution with support size $k$ under privacy constraints. A privatization scheme is applied to each raw sample independently, and we need to estimate the distribution of the raw samples from the privatized samples. A positive number $\epsilon$ measures the privacy level of a privatization scheme. For a given $\epsilon$, we consider the problem of constructing optimal privatization schemes with $\epsilon$-privacy level, i.e., schemes that minimize the expected estimation loss for the worst-case distribution. Two schemes in the literature provide order optimal performance in the high privacy regime where $\epsilon$ is very close to $0$, and in the low privacy regime where $e^{\epsilon}\approx k$, respectively. In this paper, we propose a new family of schemes which substantially improve the performance of the existing schemes in the medium privacy regime when $1\ll e^{\epsilon} \ll k$. More concretely, we prove that when $3.8 < \epsilon < \ln(k/9)$, our schemes reduce the expected estimation loss by $50\%$ under $\ell_2^2$ metric and by $30\%$ under $\ell_1$ metric over the existing schemes. We also prove a lower bound for the region $e^{\epsilon} \ll k$, which implies that our schemes are order optimal in this regime.
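To make the setting concrete, here is a minimal Python sketch of a privatization scheme applied independently to each raw sample, followed by estimation from the privatized samples alone. It uses $k$-ary randomized response, a standard $\epsilon$-locally differentially private mechanism. That choice is an assumption made for illustration: the abstract does not name the existing schemes, and this is not the new family of schemes proposed in the paper. The function names `k_rr_privatize` and `k_rr_estimate` are hypothetical.

```python
import math
import random
from collections import Counter


def k_rr_privatize(x, k, eps, rng):
    """Privatize one raw sample x in {0, ..., k-1} (illustrative scheme).

    The true symbol is reported with probability e^eps / (e^eps + k - 1);
    otherwise one of the other k - 1 symbols is reported uniformly at random.
    The ratio of any two output probabilities is at most e^eps.
    """
    p_keep = math.exp(eps) / (math.exp(eps) + k - 1)
    if rng.random() < p_keep:
        return x
    # Draw uniformly from the k - 1 symbols different from x.
    y = rng.randrange(k - 1)
    return y if y < x else y + 1


def k_rr_estimate(reports, k, eps):
    """Unbiased estimate of the raw distribution from privatized reports.

    Inverts P(Y = i) = p_other + p_i * (p_keep - p_other).
    Individual entries may fall slightly outside [0, 1].
    """
    n = len(reports)
    e = math.exp(eps)
    p_keep = e / (e + k - 1)
    p_other = 1.0 / (e + k - 1)
    counts = Counter(reports)
    return [(counts.get(i, 0) / n - p_other) / (p_keep - p_other)
            for i in range(k)]
```

The estimator only sees the privatized reports, which is the constraint the abstract describes; the estimation loss of such a scheme under $\ell_2^2$ or $\ell_1$ is what the minimax analysis compares across privacy regimes.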
