Communication Complexity of Estimating Correlations

Abstract

We characterize the communication complexity of the following distributed estimation problem. Alice and Bob observe infinitely many iid copies of $\rho$-correlated unit-variance (Gaussian or $\pm1$ binary) random variables, with unknown $\rho\in[-1,1]$. By interactively exchanging $k$ bits, Bob wants to produce an estimate $\hat\rho$ of $\rho$. We show that the best possible performance (optimized over interaction protocol $\Pi$ and estimator $\hat\rho$) satisfies $\inf_{\Pi,\hat\rho}\sup_\rho \mathbb{E}[|\rho-\hat\rho|^2] = \Theta(\tfrac{1}{k})$. Furthermore, we show that the best possible unbiased estimator achieves performance of $\frac{1+o(1)}{2k\ln 2}$. Curiously, thus, restricting communication to $k$ bits results in (order-wise) similar minimax estimation error as restricting to $k$ samples. Our results also imply an $\Omega(n)$ lower bound on the information complexity of the Gap-Hamming problem, for which we show a direct information-theoretic proof. Notably, the protocol achieving (almost) optimal performance is one-way (non-interactive). For one-way protocols we also prove the $\Omega(\tfrac{1}{k})$ bound even when $\rho$ is restricted to any small open sub-interval of $[-1,1]$ (i.e. a local minimax lower bound). Our proof techniques rely on symmetric strong data-processing inequalities, various tensorization techniques from information-theoretic interactive common-randomness extraction, and (for the local lower bound) on the Otto-Villani estimate for the Wasserstein-continuity of trajectories of the Ornstein-Uhlenbeck semigroup.
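To make the comparison between $k$ bits and $k$ samples concrete, here is a minimal simulation sketch of a naive one-way baseline in the $\pm1$ binary case. It is not the protocol from the paper: Alice simply sends her first $k$ bits and Bob averages the products $X_iY_i$, which is unbiased with mean squared error $(1-\rho^2)/k$. All function names and the chosen values of `rho`, `k`, and `trials` are illustrative assumptions.

```python
import random


def sample_pair(rho, rng):
    """Draw one pair (X, Y) of +/-1 bits with E[XY] = rho."""
    x = rng.choice((-1, 1))
    # Y agrees with X with probability (1 + rho) / 2, so E[XY] = rho.
    y = x if rng.random() < (1 + rho) / 2 else -x
    return x, y


def naive_one_way_estimate(rho, k, rng):
    """Naive baseline: Alice sends k raw bits, Bob averages X_i * Y_i."""
    total = 0
    for _ in range(k):
        x, y = sample_pair(rho, rng)
        total += x * y
    return total / k


def empirical_mse(rho, k, trials, seed=0):
    """Monte Carlo estimate of E[(rho_hat - rho)^2] for the naive baseline."""
    rng = random.Random(seed)
    errors = [(naive_one_way_estimate(rho, k, rng) - rho) ** 2
              for _ in range(trials)]
    return sum(errors) / trials


if __name__ == "__main__":
    rho = 0.3  # illustrative value, not taken from the paper
    for k in (50, 100, 200, 400):
        mse = empirical_mse(rho, k, trials=2000)
        # Second column: simulated MSE; third: the exact value (1 - rho^2) / k.
        print(k, mse, (1 - rho ** 2) / k)
```

The simulated error should track $(1-\rho^2)/k$, which equals $1/k$ at $\rho=0$. The abstract's $\frac{1+o(1)}{2k\ln 2}$ for the best unbiased estimator is smaller than that by a constant factor, consistent with Alice exploiting her unlimited samples rather than forwarding raw bits.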
