
The phase transition for the existence of the maximum likelihood estimate in high-dimensional logistic regression

Emmanuel J. Candes
Pragya Sur
Abstract

This paper rigorously establishes that the existence of the maximum likelihood estimate (MLE) in high-dimensional logistic regression models with Gaussian covariates undergoes a sharp "phase transition". We introduce an explicit boundary curve $h_{\text{MLE}}$, parameterized by two scalars measuring the overall magnitude of the unknown sequence of regression coefficients, with the following property: in the limit of large sample sizes $n$ and number of features $p$ proportioned in such a way that $p/n \rightarrow \kappa$, we show that if the problem is sufficiently high dimensional in the sense that $\kappa > h_{\text{MLE}}$, then the MLE does not exist with probability one. Conversely, if $\kappa < h_{\text{MLE}}$, the MLE asymptotically exists with probability one.
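
As a rough illustration of what such a transition looks like, the sketch below evaluates a classical special case rather than the paper's general curve. It assumes no intercept and labels that are fair coin flips independent of the covariates, and it relies on two standard facts not stated in the abstract: the logistic MLE exists exactly when the two classes cannot be separated by a hyperplane, and Cover's (1965) counting formula gives the probability that $n$ points in general position in $\mathbb{R}^p$ are separable by a hyperplane through the origin. Under these assumptions the threshold sits at $\kappa = 1/2$. The function names are hypothetical and chosen only for this example.

```python
from math import comb


def prob_separable(n: int, p: int) -> float:
    """Cover's formula: probability that n points in general position in R^p,
    labelled by independent fair coin flips, can be separated by a hyperplane
    through the origin. Assumption: signal-free model with no intercept."""
    # Number of separable labelings is 2 * sum_{k < p} C(n-1, k) out of 2^n.
    total = sum(comb(n - 1, k) for k in range(p))
    return total / 2 ** (n - 1)


def prob_mle_exists(n: int, p: int) -> float:
    """The logistic MLE exists exactly when the data are not separable."""
    return 1.0 - prob_separable(n, p)


if __name__ == "__main__":
    n = 2000
    for kappa in (0.40, 0.45, 0.50, 0.55, 0.60):
        p = round(kappa * n)
        print(f"kappa = {kappa:.2f}  P(MLE exists) = {prob_mle_exists(n, p):.4f}")
```

For fixed $\kappa$ below $1/2$ the printed probability approaches one as $n$ grows, and for $\kappa$ above $1/2$ it approaches zero. The abstract's result extends this picture to models with a nonzero signal, where the boundary is given by $h_{\text{MLE}}$ instead of the constant $1/2$.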
