
Tolerant Distribution Testing in the Conditional Sampling Model

ACM-SIAM Symposium on Discrete Algorithms (SODA), 2020
Abstract

Recently, there has been significant work studying distribution testing under the Conditional Sampling model. In this model, a query specifies a subset $S$ of the domain, and the output received is a sample drawn from the distribution conditioned on being in $S$. In this paper, we primarily study the tolerant versions of the classic uniformity and identity testing problems, providing improved query complexity bounds in the conditional sampling model. We prove that tolerant uniformity testing in the conditional sampling model can be solved using $\tilde{O}(\varepsilon^{-2})$ queries, which is known to be optimal and improves upon the $\tilde{O}(\varepsilon^{-20})$-query algorithm of [CRS15]. Our bound holds even under the Pair Conditional Sampling model, a restricted version of the conditional sampling model in which every queried subset $S$ must either have exactly $2$ elements or be the entire domain of the distribution. We also prove that tolerant identity testing in the conditional sampling model can be solved using $\tilde{O}(\varepsilon^{-4})$ queries, which is the first known bound for this problem that is independent of the support size of the distribution. Finally, we study (non-tolerant) identity testing under the pair conditional sampling model, and provide a tight bound of $\tilde{\Theta}(\sqrt{\log n} \cdot \varepsilon^{-2})$ for the query complexity, improving upon both the known upper and lower bounds in [CRS15].
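To make the query model concrete, the following is a minimal Python sketch of a conditional sampling oracle as the abstract describes it: a query names a subset $S$ and receives a sample from the distribution conditioned on $S$. It is only an illustration, not the paper's algorithm. The class name `ConditionalSampler`, the methods `sample` and `pair_sample`, the `queries` counter, and the choice to reject queries on empty or zero-mass subsets are all assumptions of this sketch; the abstract does not specify how such queries are handled.

```python
import random


class ConditionalSampler:
    """Hypothetical conditional sampling oracle over the domain {0, ..., n-1}.

    `probs[i]` is the probability of element i under the hidden distribution.
    """

    def __init__(self, probs, seed=None):
        self.probs = list(probs)
        self.rng = random.Random(seed)
        self.queries = 0  # number of oracle queries made so far

    def sample(self, subset=None):
        """Return one sample conditioned on `subset` (whole domain if None)."""
        if subset is None:
            elements = list(range(len(self.probs)))
        else:
            elements = sorted(set(subset))
        weights = [self.probs[i] for i in elements]
        # Assumption of this sketch: empty or zero-mass subsets are rejected.
        if not elements or sum(weights) == 0:
            raise ValueError("subset must be non-empty with positive mass")
        self.queries += 1
        # random.choices normalises the weights, which is exactly conditioning.
        return self.rng.choices(elements, weights=weights, k=1)[0]

    def pair_sample(self, x, y):
        """Pair conditional query: the subset has exactly two elements."""
        if x == y:
            raise ValueError("a pair query needs two distinct elements")
        return self.sample({x, y})
```

Under the pair conditional sampling restriction, a tester would call only `pair_sample(x, y)` and `sample()` with no argument, matching the two permitted query shapes (a subset of exactly two elements, or the entire domain). The `queries` counter is the quantity that the stated bounds measure.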
