Maximum Likelihood Estimation of Functionals of Discrete Distributions

Abstract

The Maximum Likelihood Estimator (MLE) is widely used in estimating functionals of discrete probability distributions, and involves "plugging in" the empirical distribution of the data. In this work we propose a general framework and procedure to analyze the performance of the MLE in estimating functionals of discrete distributions, under the worst-case mean squared error criterion. In particular, we use approximation theory to bound the bias incurred by the MLE, and concentration inequalities to bound the variance. We highlight our techniques by considering two important information measures: the entropy, and the Rényi entropy of order $\alpha$. For entropy estimation, we show that it is necessary and sufficient to have $n = \omega(S)$ observations for the MLE to be consistent, where $S$ represents the alphabet size. In addition, we show that $n = \omega(S^{1/\alpha})$ samples are necessary and sufficient for the MLE to consistently estimate $\sum_{i=1}^S p_i^\alpha$, $0 < \alpha < 1$. For both these problems, the MLE achieves the best possible sample complexity up to logarithmic factors. When $\alpha > 1$, we show that $n = \omega(\max(S^{2/\alpha - 1}, 1))$ samples suffice.
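To make the plug-in idea concrete, the following is a minimal NumPy sketch (not from the paper; the function names are illustrative) of the MLE for the entropy $H(P) = -\sum_i p_i \log p_i$ and for the power sum $F_\alpha(P) = \sum_{i=1}^S p_i^\alpha$, obtained by substituting the empirical distribution for $P$:

```python
import numpy as np

def empirical_distribution(samples, alphabet_size):
    """Empirical (maximum likelihood) estimate of the distribution
    from i.i.d. samples taking values in {0, ..., alphabet_size - 1}."""
    counts = np.bincount(samples, minlength=alphabet_size)
    return counts / counts.sum()

def mle_entropy(samples, alphabet_size):
    """Plug-in (MLE) estimate of the Shannon entropy H(P)."""
    p = empirical_distribution(samples, alphabet_size)
    p = p[p > 0]  # convention: 0 * log 0 = 0
    return -np.sum(p * np.log(p))

def mle_power_sum(samples, alphabet_size, alpha):
    """Plug-in (MLE) estimate of the power sum F_alpha(P) = sum_i p_i^alpha."""
    p = empirical_distribution(samples, alphabet_size)
    return np.sum(p ** alpha)

# Illustrative usage: alphabet size S with n = omega(S) samples,
# the regime in which the plug-in entropy estimate is consistent.
rng = np.random.default_rng(0)
S, n = 1000, 20000
p_true = rng.dirichlet(np.ones(S))
samples = rng.choice(S, size=n, p=p_true)
print(mle_entropy(samples, S), -np.sum(p_true * np.log(p_true)))
print(mle_power_sum(samples, S, alpha=0.5), np.sum(p_true ** 0.5))
```

Note that the plug-in entropy estimate is biased downward at finite $n$ (by Jensen's inequality, since entropy is concave); the paper's approximation-theoretic analysis bounds exactly this bias, and concentration inequalities control the variance.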
