Finite-Sample Concentration of the Multinomial in Relative Entropy

We show that the moment generating function of the Kullback-Leibler divergence (relative entropy) between the empirical distribution of $n$ independent samples from a distribution $P$ over a finite alphabet of size $k$ (i.e. a multinomial distribution) and $P$ itself is no more than that of a gamma distribution with shape $k - 1$ and rate $n$. The resulting exponential concentration inequality becomes meaningful (less than 1) when the divergence $\varepsilon$ is larger than $(k - 1)/n$, whereas the standard method of types bound requires $\varepsilon$ to be larger than $\log\binom{n + k - 1}{k - 1}/n \ge \frac{k - 1}{n}\log\bigl(1 + \frac{n}{k - 1}\bigr)$, thus saving a factor of order $\log(n/k)$ in the standard regime of parameters where $k = o(n)$. As a consequence, we also obtain finite-sample bounds on all the moments of the empirical divergence (equivalently, the discrete likelihood-ratio statistic), which are within constant factors (depending on the moment) of their asymptotic values. Our proof proceeds via a simple reduction to the case $k = 2$ of a binary alphabet (i.e. a binomial distribution), and has the property that improvements in the case $k = 2$ directly translate to improvements for general $k$. In particular, we conjecture a bound on the binomial moment generating function that would almost close the quadratic gap between our finite-sample bound and the asymptotic moment generating function bound from Wilks' theorem (which does not hold for finite samples).
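
To see where the $(k - 1)/n$ threshold comes from, here is the standard Chernoff calculation under the stated gamma MGF bound (a sketch for intuition, not a restatement of the paper's proof; we write $\hat{P}_n$ for the empirical distribution of the $n$ samples). The gamma distribution with shape $k - 1$ and rate $n$ has MGF $(1 - t/n)^{-(k-1)}$ for $0 \le t < n$, so if $\mathbb{E}\bigl[e^{t\,D(\hat{P}_n \,\|\, P)}\bigr] \le (1 - t/n)^{-(k-1)}$, then for $\varepsilon \ge (k - 1)/n$ the optimal choice $t = n - (k - 1)/\varepsilon$ gives
$$\Pr\bigl[D(\hat{P}_n \,\|\, P) \ge \varepsilon\bigr] \;\le\; \min_{0 \le t < n} e^{-t\varepsilon}\,(1 - t/n)^{-(k-1)} \;=\; \left(\frac{e n \varepsilon}{k - 1}\right)^{k-1} e^{-n\varepsilon},$$
which equals $1$ exactly at $\varepsilon = (k - 1)/n$ and decays exponentially for larger $\varepsilon$, matching the stated threshold.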