
Information Theory of Penalized Likelihoods and its Statistical Implications

Abstract

We extend the correspondence between two-stage coding procedures in data compression and penalized likelihood procedures in statistical estimation. Traditionally, this correspondence required restriction to countable parameter spaces; we show how to extend it to the uncountable parameter case. Leveraging the description length interpretation of penalized likelihood procedures, we devise new techniques for deriving adaptive risk bounds for such procedures. We show that the existence of certain countable coverings of the parameter space implies adaptive risk bounds, and thus our theory is quite general. We apply our techniques to derive risk bounds for $\ell_1$-type penalized procedures in canonical high-dimensional statistical problems such as linear regression and Gaussian graphical models. In the linear regression problem, we also demonstrate how the traditional $\ell_0$ penalty times $\frac{\log n}{2}$, plus lower-order terms, admits a two-stage description length interpretation, and we present risk bounds for this penalized likelihood procedure.
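To make the $\ell_0$ penalty concrete, the following sketch illustrates the penalized likelihood criterion the abstract mentions for linear regression: minimize the Gaussian negative log-likelihood plus $\frac{\log n}{2}$ per nonzero coefficient, over coordinate subsets. This is an illustrative implementation written for this note, not code from the paper; the function names (`penalized_nll`, `best_subset`) and the simulated data are assumptions, and lower-order penalty terms are omitted.

```python
import itertools
import numpy as np

def penalized_nll(X, y, support, penalty_per_param):
    """Profiled Gaussian negative log-likelihood of OLS restricted to a
    coordinate subset, plus an l0 penalty of penalty_per_param per
    selected coefficient. Illustrative sketch only."""
    n = len(y)
    if support:
        Xs = X[:, support]
        beta, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        resid = y - Xs @ beta
    else:
        resid = y
    sigma2 = max(resid @ resid / n, 1e-12)  # profiled-out noise variance
    nll = 0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    return nll + len(support) * penalty_per_param

def best_subset(X, y):
    """Exhaustive subset search under the (log n)/2-per-parameter penalty.
    Exponential in p, so only sensible for small p in this toy example."""
    n, p = X.shape
    pen = np.log(n) / 2  # the l0 penalty weight from the abstract
    subsets = (list(s) for k in range(p + 1)
               for s in itertools.combinations(range(p), k))
    return min(subsets, key=lambda s: penalized_nll(X, y, s, pen))

# Toy data: only the first coordinate carries signal.
rng = np.random.default_rng(0)
n, p = 100, 5
X = rng.standard_normal((n, p))
y = 2.0 * X[:, 0] + 0.5 * rng.standard_normal(n)
print(best_subset(X, y))
```

With a strong signal on the first coordinate, the penalty (about 2.3 per parameter here) is typically enough to exclude the noise coordinates, so the selected support is essentially the true one.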
