Linear Queries Estimation with Local Differential Privacy

We study the problem of estimating a set of linear queries with respect to some unknown distribution over a domain based on a sensitive data set of individuals under the constraint of local differential privacy. This problem subsumes a wide range of estimation tasks, e.g., distribution estimation and -dimensional mean estimation. We provide new algorithms for both the offline (non-adaptive) and adaptive versions of this problem. In the offline setting, the set of queries is fixed before the algorithm starts. In the regime where , our algorithms attain estimation error that is independent of , and is tight up to a factor of . For the special case of distribution estimation, we show that projecting the output estimate of an algorithm due to [Acharya et al. 2018] onto the probability simplex yields an error that depends only sub-logarithmically on in the regime where . These results show that accurate estimation of linear queries is possible in high-dimensional settings under the error criterion. In the adaptive setting, the queries are generated over rounds, one query at a time. In each round, a query can be chosen adaptively based on the full history of previous queries and answers. We give an algorithm for this problem with optimal estimation error (worst error in the estimated values for the queries w.r.t. the data distribution). Our bound matches a lower bound on the error for the offline version of this problem [Duchi et al. 2013].
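The projection step mentioned for distribution estimation can be illustrated with a minimal sketch. It assumes the projection is the Euclidean one, which the abstract does not spell out, and it uses a standard sort-based method rather than anything taken from the paper. The function name `project_to_simplex` and the sample vector are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of a real vector onto the probability simplex.

    Illustrative sketch only: a standard sort-and-threshold method,
    not the procedure from the paper.
    """
    u = sorted(v, reverse=True)  # entries in decreasing order
    running_sum = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        running_sum += u_k
        candidate = (running_sum - 1.0) / k
        # The condition holds for a prefix of k values; keep the last one.
        if u_k - candidate > 0:
            theta = candidate
    # Shift every entry by theta and clip negatives to zero.
    return [max(x - theta, 0.0) for x in v]


# Hypothetical noisy estimate: it sums to 1 but has a negative entry,
# as can happen when a private estimator adds zero-mean noise.
noisy_estimate = [0.5, 0.7, -0.2]
projected = project_to_simplex(noisy_estimate)
print(projected)
```

The output is a valid probability vector (non-negative entries summing to one), and a vector that already lies on the simplex is returned unchanged.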