Learning Sums of Independent Random Variables with Sparse Collective Support

Abstract

We study the learnability of sums of independent integer random variables given a bound on the size of the union of their supports. For $\mathcal{A} \subset \mathbf{Z}_{+}$, a sum of independent random variables with collective support $\mathcal{A}$ (called an $\mathcal{A}$-sum in this paper) is a distribution $\mathbf{S} = \mathbf{X}_1 + \cdots + \mathbf{X}_N$ where the $\mathbf{X}_i$'s are mutually independent (but not necessarily identically distributed) integer random variables with $\cup_i \mathsf{supp}(\mathbf{X}_i) \subseteq \mathcal{A}$. We give two main algorithmic results for learning such distributions:

1. For the case $|\mathcal{A}| = 3$, we give an algorithm for learning $\mathcal{A}$-sums to accuracy $\epsilon$ that uses $\mathsf{poly}(1/\epsilon)$ samples and runs in time $\mathsf{poly}(1/\epsilon)$, independent of $N$ and of the elements of $\mathcal{A}$.

2. For an arbitrary constant $k \geq 4$, if $\mathcal{A} = \{a_1, \ldots, a_k\}$ with $0 \leq a_1 < \cdots < a_k$, we give an algorithm that uses $\mathsf{poly}(1/\epsilon) \cdot \log\log a_k$ samples (independent of $N$) and runs in time $\mathsf{poly}(1/\epsilon, \log a_k)$.

We prove an essentially matching lower bound: if $|\mathcal{A}| = 4$, then any algorithm must use $\Omega(\log\log a_4)$ samples even for learning to constant accuracy. We also give similar-in-spirit (but quantitatively very different) algorithmic results, and essentially matching lower bounds, for the case in which $\mathcal{A}$ is not known to the learner.
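To make the object of study concrete, the following is a minimal sketch of how one might sample from an $\mathcal{A}$-sum as defined above: $N$ mutually independent, not necessarily identically distributed integer random variables whose supports all lie in a common set $\mathcal{A}$. The function name, the dict-based representation of each component distribution, and the particular example components are illustrative choices, not anything specified in the paper.

```python
import random

def sample_A_sum(components, rng=random):
    """Draw one sample of S = X_1 + ... + X_N, where each X_i is an
    independent integer random variable. Each entry of `components`
    is a dict mapping a value in A to its probability, so the union
    of the dicts' keys is the collective support A."""
    total = 0
    for dist in components:
        values = list(dist.keys())
        weights = [dist[v] for v in values]
        # Sample this X_i independently of the others and add it in.
        total += rng.choices(values, weights=weights, k=1)[0]
    return total

# Hypothetical example: an A-sum with collective support A = {0, 1, 2}
# (the |A| = 3 case of the first result), with N = 4 components that
# are deliberately not identically distributed.
components = [
    {0: 0.5, 1: 0.5},
    {0: 0.2, 2: 0.8},
    {1: 0.7, 2: 0.3},
    {0: 0.9, 1: 0.05, 2: 0.05},
]
sample = sample_A_sum(components)
```

Here every draw lies between 1 (the sum of the smallest supported values, 0+0+1+0) and 7 (1+2+2+2); the learning problem is to approximate the distribution of such samples to accuracy $\epsilon$ without knowing the individual components.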
