Learning Powers of Poisson Binomial Distributions

18 July 2017
Dimitris Fotakis
Vasilis Kontonis
Piotr Krysta
Paul Spirakis
arXiv:1707.05662 (abs | PDF | HTML)
Abstract

We introduce the problem of simultaneously learning all powers of a Poisson Binomial Distribution (PBD). A PBD of order $n$ is the distribution of a sum of $n$ mutually independent Bernoulli random variables $X_i$, where $\mathbb{E}[X_i] = p_i$. The $k$-th power of this distribution, for $k$ in a range $[m]$, is the distribution of $P_k = \sum_{i=1}^n X_i^{(k)}$, where each Bernoulli random variable $X_i^{(k)}$ has $\mathbb{E}[X_i^{(k)}] = (p_i)^k$. The learning algorithm can query any power $P_k$ several times, and it succeeds in learning all powers in the range if, with probability at least $1-\delta$, for any given $k \in [m]$ it returns a probability distribution $Q_k$ whose total variation distance from $P_k$ is at most $\epsilon$. We provide almost matching lower and upper bounds on the query complexity of this problem. We first show a lower bound on instances of PBD powers with many distinct, separated parameters $p_i$, and we almost match it by analyzing the query complexity of simultaneously learning all powers of a special class of PBDs resembling those of our lower bound. We also study the fundamental setting of a Binomial distribution and provide an optimal algorithm that uses $O(1/\epsilon^2)$ samples. Diakonikolas, Kane, and Stewart [COLT'16] showed a lower bound of $\Omega(2^{1/\epsilon})$ samples for learning the $p_i$'s within error $\epsilon$. The question of whether sampling from powers of PBDs can reduce this sample complexity has a negative answer: we show that an exponential number of samples is inevitable. We then give a nearly optimal algorithm that learns the $p_i$'s of a PBD given sampling access to its powers. To prove our last two lower bounds, we extend the classical minimax risk definition from statistics to the estimation of functions of sequences of distributions.
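
To make the definition above concrete, here is a minimal sketch of a sampling oracle for the $k$-th power of a PBD, assuming NumPy; the function name `sample_pbd_power` is a hypothetical label, not from the paper. It implements the definition directly: each parameter $p_i$ is raised to the $k$-th power, and a sample is the sum of the resulting independent Bernoulli draws.

```python
import numpy as np

def sample_pbd_power(p, k, num_samples, rng=None):
    """Draw samples from the k-th power of a PBD with parameters p.

    The k-th power P_k is the distribution of sum_i X_i^(k), where
    X_i^(k) is Bernoulli with mean (p_i)^k, as defined in the abstract.
    """
    rng = np.random.default_rng() if rng is None else rng
    p_k = np.asarray(p, dtype=float) ** k  # parameters of the k-th power
    # One row per sample: n independent Bernoulli draws, then sum over i.
    draws = rng.random((num_samples, p_k.size)) < p_k
    return draws.sum(axis=1)

# Example: 10,000 samples from the 2nd power of a PBD with p = (0.9, 0.5, 0.3).
samples = sample_pbd_power([0.9, 0.5, 0.3], k=2, num_samples=10_000)
```

A learner in the query model above would call such an oracle for whichever powers $P_k$ it chooses, and the total number of calls is its query complexity.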

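The abstract does not spell out the optimal Binomial algorithm, so the sketch below is only a plausible plug-in estimator with the same $O(1/\epsilon^2)$ sample cost, not necessarily the authors' method: estimate $p$ by the empirical mean of the samples divided by $n$, and output $\mathrm{Binomial}(n, \hat{p})$ as the hypothesis. The constant `c` and the names `learn_binomial` and `sample_oracle` are illustrative assumptions.

```python
import numpy as np

def learn_binomial(sample_oracle, n, eps, c=4.0):
    """Plug-in learner for Binomial(n, p) from O(1/eps^2) samples.

    Draws ceil(c / eps^2) samples, estimates p by the empirical mean
    divided by n, and returns p_hat; the hypothesis distribution is
    Binomial(n, p_hat). The constant c is illustrative only.
    """
    num_samples = int(np.ceil(c / eps ** 2))
    samples = np.asarray(sample_oracle(num_samples), dtype=float)
    return float(np.clip(samples.mean() / n, 0.0, 1.0))

# Illustrative run: true parameter p = 0.3, n = 50 trials.
rng = np.random.default_rng(0)
oracle = lambda m: rng.binomial(50, 0.3, size=m)
print(learn_binomial(oracle, n=50, eps=0.05))  # p_hat close to 0.3
```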