
Testing probability distributions using conditional samples

Abstract

We study a new framework for property testing of probability distributions, by considering distribution testing algorithms that have access to a conditional sampling oracle.* This is an oracle that takes as input a subset $S \subseteq [N]$ of the domain $[N]$ of the unknown probability distribution $D$ and returns a draw from the conditional probability distribution $D$ restricted to $S$. This new model allows considerable flexibility in the design of distribution testing algorithms; in particular, testing algorithms in this model can be adaptive. We study a wide range of natural distribution testing problems in this new framework and some of its variants, giving both upper and lower bounds on query complexity. These problems include testing whether $D$ is the uniform distribution $\mathcal{U}$; testing whether $D = D^\ast$ for an explicitly provided $D^\ast$; testing whether two unknown distributions $D_1$ and $D_2$ are equivalent; and estimating the variation distance between $D$ and the uniform distribution. At a high level our main finding is that the new "conditional sampling" framework we consider is a powerful one: while all the problems mentioned above have $\Omega(\sqrt{N})$ sample complexity in the standard model (and in some cases the complexity must be almost linear in $N$), we give $\mathrm{poly}(\log N, 1/\varepsilon)$-query algorithms (and in some cases $\mathrm{poly}(1/\varepsilon)$-query algorithms independent of $N$) for all these problems in our conditional sampling setting.

*Independently from our work, Chakraborty et al. also considered this framework. We discuss their work in Subsection 1.4.
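To make the oracle definition concrete, the following is a minimal Python sketch of a simulated conditional sampling oracle. It is an illustration only, not an algorithm from the paper: the names `CondOracle`, `sample`, and `pair_frequency` are hypothetical, the simulator holds $D$ explicitly (a real tester would see only the oracle's answers), and raising an error when $D(S) = 0$ is an assumption of this sketch, since the abstract does not say how that case is handled.

```python
import random
from collections import Counter


class CondOracle:
    """Simulated conditional sampling oracle for a distribution D on {1, ..., N}.

    Hypothetical helper for illustration: a tester would call sample() only
    and would never inspect the stored probabilities.
    """

    def __init__(self, probs, seed=None):
        # probs[i - 1] holds D(i) for i in 1..N.
        self._probs = list(probs)
        self._rng = random.Random(seed)
        self.queries = 0  # number of oracle calls made so far

    def sample(self, subset):
        """Return one draw from D restricted to `subset`, a subset of [N]."""
        elements = sorted(set(subset))
        n = len(self._probs)
        if not elements or not all(1 <= i <= n for i in elements):
            raise ValueError("subset must be a non-empty subset of {1, ..., N}")
        weights = [self._probs[i - 1] for i in elements]
        if sum(weights) <= 0:
            # Assumption of this sketch: reject sets of zero probability.
            raise ValueError("D(S) = 0 is not handled in this sketch")
        self.queries += 1
        # Draw i with probability D(i) / D(S).
        return self._rng.choices(elements, weights=weights, k=1)[0]


def pair_frequency(oracle, x, y, num_queries):
    """Fraction of conditional draws on {x, y} that return x.

    This fraction concentrates around D(x) / (D(x) + D(y)).
    """
    counts = Counter(oracle.sample({x, y}) for _ in range(num_queries))
    return counts[x] / num_queries


if __name__ == "__main__":
    n = 8
    uniform = CondOracle([1 / n] * n, seed=0)
    # Element 1 carries half the mass; the rest is spread evenly.
    skewed = CondOracle([0.5] + [0.5 / (n - 1)] * (n - 1), seed=0)
    print("uniform:", pair_frequency(uniform, 1, 2, 2000))
    print("skewed: ", pair_frequency(skewed, 1, 2, 2000))
```

Conditioning on a two-element set exposes the relative weight of the two elements using a number of queries that does not depend on $N$, which gives some intuition for why conditional access can be far more powerful than plain sampling. Because each query set may depend on earlier answers, a tester built on such an oracle can be adaptive.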
