Tolerant Distribution Testing in the Conditional Sampling Model

ACM-SIAM Symposium on Discrete Algorithms (SODA), 2020

20 July 2020

Abstract

Recently, there has been significant work studying distribution testing under the Conditional Sampling model. In this model, a query specifies a subset $S$ of the domain, and the output received is a sample drawn from the distribution conditioned on being in $S$. In this paper, we primarily study the \emph{tolerant} versions of the classic \emph{uniformity} and \emph{identity} testing problems, providing improved query complexity bounds in the conditional sampling model. We prove that tolerant uniformity testing in the conditional sampling model can be solved using $\tilde{O}(\varepsilon^{-2})$ queries, which is known to be optimal and improves upon the $\tilde{O}(\varepsilon^{-20})$-query algorithm of [CRS15]. Our bound holds even under the Pair Conditional Sampling model, a restricted version of the conditional sampling model in which every queried subset $S$ must either have exactly $2$ elements or be the entire domain of the distribution. We also prove that tolerant identity testing in the conditional sampling model can be solved using $\tilde{O}(\varepsilon^{-4})$ queries, which is the first known bound for this problem that is independent of the support size of the distribution. Finally, we study (non-tolerant) identity testing under the pair conditional sampling model, and provide a tight bound of $\tilde{\Theta}(\sqrt{\log n} \cdot \varepsilon^{-2})$ on the query complexity, improving upon both the known upper and lower bounds in [CRS15].
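To make the query model concrete, the following is a minimal Python sketch of a conditional sampling oracle over a small, explicitly known distribution. It illustrates only the access model described above, not any of the testing algorithms from the paper. The class name `CondOracle`, its methods, and the query counter are hypothetical names introduced here for illustration. The behaviour on a subset of zero probability mass (returning a uniform element of the subset) is an assumption of this sketch rather than something stated in the abstract.

```python
import random
from typing import Hashable, Iterable, Mapping, Optional


class CondOracle:
    """Toy conditional-sampling oracle for an explicitly given finite distribution.

    Hypothetical helper for illustration; in the testing setting the
    distribution is unknown and only the oracle interface is available.
    """

    def __init__(self, probs: Mapping[Hashable, float], seed: Optional[int] = None):
        if abs(sum(probs.values()) - 1.0) > 1e-9:
            raise ValueError("probabilities must sum to 1")
        self.probs = dict(probs)
        self.rng = random.Random(seed)
        self.queries = 0  # number of oracle calls made so far

    def cond(self, subset: Iterable[Hashable]):
        """Return one sample from the distribution conditioned on `subset`."""
        items = list(dict.fromkeys(subset))  # de-duplicate, keep order
        if not items or any(x not in self.probs for x in items):
            raise ValueError("subset must be a non-empty subset of the domain")
        self.queries += 1
        weights = [self.probs[x] for x in items]
        if sum(weights) == 0:
            # Assumption of this sketch: a zero-mass subset yields a
            # uniformly random element of the subset.
            return self.rng.choice(items)
        return self.rng.choices(items, weights=weights, k=1)[0]

    def sample(self):
        """Ordinary sample: condition on the entire domain."""
        return self.cond(self.probs.keys())

    def pair_cond(self, x: Hashable, y: Hashable):
        """Pair-conditional query: the subset has exactly two elements."""
        if x == y:
            raise ValueError("a pair query needs two distinct elements")
        return self.cond([x, y])


if __name__ == "__main__":
    oracle = CondOracle({"a": 0.5, "b": 0.25, "c": 0.25}, seed=1)
    draws = [oracle.pair_cond("a", "b") for _ in range(1000)]
    # Conditioned on {a, b}, "a" has probability 0.5 / 0.75 = 2/3.
    print(draws.count("a") / len(draws), oracle.queries)
```

In the Pair Conditional Sampling model, a tester would be restricted to calling `sample` and `pair_cond`, while the general model also permits `cond` on arbitrary subsets. The `queries` counter corresponds to the quantity bounded by the query complexity results.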
