
Finite-Sample Concentration of the Multinomial in Relative Entropy

Abstract

We show that the moment generating function of the Kullback-Leibler divergence (relative entropy) between the empirical distribution of $n$ independent samples from a distribution $P$ over a finite alphabet of size $k$ (i.e. a multinomial distribution) and $P$ itself is no more than that of a gamma distribution with shape $k-1$ and rate $n$. The resulting exponential concentration inequality becomes meaningful (less than 1) when the divergence $\varepsilon$ is larger than $(k-1)/n$, whereas the standard method of types bound requires $\varepsilon > \frac{1}{n}\log\binom{n+k-1}{k-1} \geq \frac{k-1}{n}\log(1 + n/(k-1))$, thus saving a factor of order $\log(n/k)$ in the standard regime of parameters where $n \gg k$. As a consequence, we also obtain finite-sample bounds on all the moments of the empirical divergence (equivalently, the discrete likelihood-ratio statistic), which are within constant factors (depending on the moment) of their asymptotic values. Our proof proceeds via a simple reduction to the case $k = 2$ of a binary alphabet (i.e. a binomial distribution), and has the property that improvements in the case of $k = 2$ directly translate to improvements for general $k$. In particular, we conjecture a bound on the binomial moment generating function that would almost close the quadratic gap between our finite-sample bound and the asymptotic moment generating function bound from Wilks' theorem (which does not hold for finite samples).
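As a sketch of how the stated MGF domination yields the concentration inequality (the explicit form below is not quoted from the paper, and the notation $\hat{P}_n$ for the empirical distribution and $D(\cdot\,\|\,\cdot)$ for relative entropy is assumed here): if $\mathbb{E}\bigl[e^{\lambda D(\hat{P}_n \| P)}\bigr] \le (1 - \lambda/n)^{-(k-1)}$ for $0 \le \lambda < n$, i.e. domination by a gamma distribution with shape $k-1$ and rate $n$, then a standard Chernoff argument gives

% Chernoff bound under the assumed gamma MGF domination
\[
\Pr\bigl[D(\hat{P}_n \,\|\, P) \ge \varepsilon\bigr]
\;\le\; \inf_{0 \le \lambda < n} e^{-\lambda \varepsilon}\Bigl(1 - \tfrac{\lambda}{n}\Bigr)^{-(k-1)}
\;=\; e^{-n\varepsilon}\Bigl(\tfrac{e\,n\varepsilon}{k-1}\Bigr)^{k-1}
\qquad \text{for } \varepsilon \ge \tfrac{k-1}{n},
\]

where the infimum is attained at $\lambda = n - (k-1)/\varepsilon$; the right-hand side equals 1 at $\varepsilon = (k-1)/n$ and decreases in $\varepsilon$ beyond that point, consistent with the regime described above.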
