Noisy population recovery in polynomial time

24 February 2016

Abstract

In the noisy population recovery problem of Dvir et al., the goal is to learn an unknown distribution $f$ on binary strings of length $n$ from noisy samples. For some parameter $\mu \in [0,1]$ , a noisy sample is generated by flipping each coordinate of a sample from $f$ independently with probability $(1-\mu)/2$ . We assume an upper bound $k$ on the size of the support of the distribution, and the goal is to estimate the probability of any string to within some given error $\varepsilon$ . It is known that the algorithmic complexity and sample complexity of this problem are polynomially related to each other. We show that for $\mu > 0$ , the sample complexity (and hence the algorithmic complexity) is bounded by a polynomial in $k$ , $n$ and $1/\varepsilon$ improving upon the previous best result of $\mathsf{poly}(k^{\log\log k},n,1/\varepsilon)$ due to Lovett and Zhang. Our proof combines ideas from Lovett and Zhang with a \emph{noise attenuated} version of M\"{o}bius inversion. In turn, the latter crucially uses the construction of \emph{robust local inverse} due to Moitra and Saks.

View on arXiv

Comments on this paper