
High-Dimensional Graphical Model Selection Using $\ell_1$-Regularized Logistic Regression

Abstract

We consider the problem of estimating the graph structure associated with a discrete Markov random field. We describe a method based on $\ell_1$-regularized logistic regression, in which the neighborhood of any given node is estimated by performing logistic regression subject to an $\ell_1$-constraint. Our framework applies to the high-dimensional setting, in which both the number of nodes $p$ and maximum neighborhood sizes $d$ are allowed to grow as a function of the number of observations $n$. Our main results provide sufficient conditions on the triple $(n, p, d)$ for the method to succeed in consistently estimating the neighborhood of every node in the graph simultaneously. Under certain assumptions on the population Fisher information matrix, we prove that consistent neighborhood selection can be obtained for sample sizes $n = \Omega(d^3 \log p)$, with the error decaying as $O(\exp(-C n/d^3))$ for some constant $C$. If these same assumptions are imposed directly on the sample matrices, we show that $n = \Omega(d^2 \log p)$ samples are sufficient.
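
To make the neighborhood-selection idea concrete, the following is a minimal sketch rather than the paper's implementation. It assumes a binary ($\pm 1$) Ising chain on four nodes, uses the penalized (Lagrangian) form of the $\ell_1$ problem instead of an explicit constraint, solves it with plain proximal gradient descent, and combines the per-node neighborhoods with an AND rule. The function names, the regularization weight `lam`, the sample size, and the coupling `theta` are all illustrative choices, not values taken from the paper.

```python
import math
import random


def sample_chain_ising(n, p, theta, rng):
    """Draw n exact samples from a zero-field Ising chain with coupling theta."""
    p_same = 1.0 / (1.0 + math.exp(-2.0 * theta))  # P(x[i+1] == x[i])
    samples = []
    for _ in range(n):
        x = [rng.choice((-1, 1))]
        for _node in range(p - 1):
            x.append(x[-1] if rng.random() < p_same else -x[-1])
        samples.append(x)
    return samples


def soft_threshold(v, t):
    """Proximal operator of t * |v|: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def l1_logistic(Z, y, lam, iters=100):
    """Minimize mean(log(1 + exp(-y * <w, z>))) + lam * ||w||_1 (proximal gradient)."""
    n, k = len(Z), len(Z[0])
    step = 4.0 / k  # 1/L: covariates are +-1, so the smooth part has L <= k/4
    w = [0.0] * k
    for _ in range(iters):
        grad = [0.0] * k
        for z, label in zip(Z, y):
            margin = label * sum(wj * zj for wj, zj in zip(w, z))
            coeff = -label / (1.0 + math.exp(margin))  # logistic-loss derivative
            for j in range(k):
                grad[j] += coeff * z[j]
        w = [soft_threshold(wj - step * gj / n, step * lam)
             for wj, gj in zip(w, grad)]
    return w


def estimate_neighborhoods(samples, lam):
    """Regress each node on all others; nonzero weights define its neighborhood."""
    p = len(samples[0])
    neighborhoods = []
    for s in range(p):
        others = [t for t in range(p) if t != s]
        Z = [[x[t] for t in others] for x in samples]
        y = [x[s] for x in samples]
        w = l1_logistic(Z, y, lam)
        neighborhoods.append({t for t, wt in zip(others, w) if wt != 0.0})
    return neighborhoods


rng = random.Random(0)
samples = sample_chain_ising(n=1500, p=4, theta=0.5, rng=rng)
neighborhoods = estimate_neighborhoods(samples, lam=0.12)
# AND rule (an assumption here): keep an edge only if both endpoints select each other.
edges = {(s, t) for s in range(4) for t in neighborhoods[s]
         if s < t and s in neighborhoods[t]}
print(sorted(edges))
```

With these settings the estimated edge set should coincide with the chain edges, since the penalty is large relative to the sampling noise on non-neighbor coordinates. In practice `lam` would need to be tuned, and the abstract's guarantees concern how $n$ must scale with $d$ and $p$ rather than any particular solver.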
