Concentration of the multinomial in Kullback-Leibler divergence near the ratio of alphabet and sample sizes

Abstract

We bound the moment generating function of the Kullback-Leibler divergence between the empirical distribution of independent samples from a distribution over a finite alphabet (e.g. a multinomial distribution) and the underlying distribution via a simple reduction to the case of a binary alphabet (e.g. a binomial distribution). The resulting concentration inequality becomes meaningful (less than 1) when the deviation $\varepsilon$ is a constant factor larger than the ratio $(k-1)/n$, for $k$ the alphabet size and $n$ the number of samples, whereas the standard method of types bound requires $\varepsilon > (k-1)/n \cdot \log(1 + n/(k-1))$.
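A minimal sketch of the gap between the two thresholds mentioned in the abstract. The function names and the choice of $k$ and $n$ are illustrative assumptions, not from the paper; the paper's bound involves an unspecified constant factor, so only the base ratio $(k-1)/n$ is shown.

```python
import numpy as np

def types_threshold(k, n):
    # Deviation level at which the standard method-of-types bound
    # becomes meaningful: (k-1)/n * log(1 + n/(k-1)).
    return (k - 1) / n * np.log(1 + n / (k - 1))

def base_ratio(k, n):
    # The paper's bound is meaningful once the deviation exceeds a
    # constant factor times this ratio (constant not specified here).
    return (k - 1) / n

k, n = 100, 10_000
print(base_ratio(k, n))       # ~0.0099
print(types_threshold(k, n))  # ~0.0458, larger by a log factor
```

For $n \gg k$, the method-of-types threshold exceeds the ratio $(k-1)/n$ by a factor of roughly $\log(n/(k-1))$, which is the improvement the abstract claims.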
