Concentration Inequalities for the Empirical Distribution

Abstract

We study concentration inequalities for the Kullback--Leibler (KL) divergence between the empirical distribution and the true distribution. Applying a recursion technique, we improve over the method of types bound uniformly in all regimes of sample size $n$ and alphabet size $k$, and the improvement becomes more significant when $k$ is large. We discuss the applications of our results in obtaining tighter concentration inequalities for $L_1$ deviations of the empirical distribution from the true distribution, and the difference between concentration around the expectation and concentration around zero. We also obtain asymptotically tight bounds on the variance of the KL divergence between the empirical and true distributions, and demonstrate that these bounds behave quantitatively differently depending on whether the sample size is small or large relative to the alphabet size.
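
For orientation, the baseline that the abstract refers to can be sketched numerically. The snippet below is an illustrative sketch only, not the paper's method: it computes the KL divergence between an empirical distribution and a known distribution, and evaluates the classical method of types bound $\binom{n+k-1}{k-1} e^{-n\varepsilon}$ on the probability that this divergence is at least $\varepsilon$. The function names, the uniform true distribution, and the Monte Carlo check are assumptions made for the example; the improved bounds from the paper are not reproduced here.

```python
import math
import random
from collections import Counter


def empirical_kl(samples, p):
    """KL divergence D(P_hat || P) in nats, where P_hat is the empirical
    distribution of `samples` and `p[i]` is the true probability of symbol i."""
    n = len(samples)
    counts = Counter(samples)
    # Symbols with zero count contribute 0 to the sum, so only observed ones appear.
    return sum((c / n) * math.log((c / n) / p[i]) for i, c in counts.items())


def method_of_types_bound(n, k, eps):
    """Classical method of types bound on P(D(P_hat || P) >= eps):
    (number of types) * exp(-n * eps), capped at 1."""
    num_types = math.comb(n + k - 1, k - 1)
    return min(1.0, num_types * math.exp(-n * eps))


def simulate_tail(n, k, eps, trials=2000, seed=0):
    """Monte Carlo estimate of P(D(P_hat || P) >= eps).
    Assumption for this sketch: the true distribution is uniform on k symbols."""
    rng = random.Random(seed)
    p = [1.0 / k] * k
    hits = 0
    for _ in range(trials):
        samples = rng.choices(range(k), weights=p, k=n)
        if empirical_kl(samples, p) >= eps:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    n, k, eps = 100, 3, 0.05
    print("simulated tail:", simulate_tail(n, k, eps))
    print("method of types bound:", method_of_types_bound(n, k, eps))
```

Comparing the two printed values for a few choices of `n` and `k` gives a rough sense of how loose the baseline bound can be, which is the gap the abstract's improved inequalities address.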
