
LASSO risk and phase transition under dependence

Abstract

We consider the problem of recovering a $k$-sparse signal $\boldsymbol{\beta}_0\in\mathbb{R}^p$ from noisy observations $\mathbf{y}=\mathbf{X}\boldsymbol{\beta}_0+\mathbf{w}\in\mathbb{R}^n$. One of the most popular approaches is $\ell_1$-regularized least squares, also known as LASSO. We analyze the mean square error of LASSO for random designs in which each row of $\mathbf{X}$ is drawn from the distribution $N(0,\boldsymbol{\Sigma})$ with general $\boldsymbol{\Sigma}$. We first derive the asymptotic risk of LASSO in the limit $n,p\rightarrow\infty$ with $n/p\rightarrow\delta$. We then examine conditions on $n$, $p$, and $k$ for LASSO to exactly reconstruct $\boldsymbol{\beta}_0$ in the noiseless case $\mathbf{w}=0$. A phase boundary $\delta_c=\delta(\epsilon)$ is precisely established in the phase space defined by $0\le\delta,\epsilon\le 1$, where $\epsilon=k/p$. Above this boundary, LASSO perfectly recovers $\boldsymbol{\beta}_0$ with high probability. Below this boundary, LASSO fails to recover $\boldsymbol{\beta}_0$ with high probability. While the values of the nonzero elements of $\boldsymbol{\beta}_0$ do not have any effect on the phase transition curve, our analysis shows that $\delta_c$ does depend on the sign pattern of the nonzero values of $\boldsymbol{\beta}_0$ for general $\boldsymbol{\Sigma}\ne\mathbf{I}_p$. This is in sharp contrast to previous phase transition results derived in the i.i.d. case with $\boldsymbol{\Sigma}=\mathbf{I}_p$, where $\delta_c$ is completely determined by $\epsilon$ regardless of the distribution of $\boldsymbol{\beta}_0$. Underlying our formalism is a recently developed efficient algorithm called approximate message passing (AMP). We generalize the state evolution of AMP from the i.i.d. case to the general case with $\boldsymbol{\Sigma}\ne\mathbf{I}_p$. Extensive computational experiments confirm that our theoretical predictions are consistent with simulation results on moderate-size systems.
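
The noiseless recovery setup described above can be illustrated with a small numerical sketch. Everything specific in it is an assumption made for illustration and is not taken from the paper: the AR(1) covariance $\Sigma_{ij}=\rho^{|i-j|}$, the row scaling $N(0,\boldsymbol{\Sigma}/n)$, nonzero entries equal to $\pm 1$, the small fixed penalty `lam`, and the use of FISTA (a generic proximal gradient solver, not AMP) to compute the LASSO estimate. The helper names `lasso_fista` and `relative_error` are hypothetical.

```python
import numpy as np


def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def lasso_fista(X, y, lam, n_iter=3000):
    """Approximately minimize 0.5 * ||y - X b||^2 + lam * ||b||_1 with FISTA."""
    p = X.shape[1]
    step = 1.0 / np.linalg.norm(X, 2) ** 2  # 1 / Lipschitz constant of the gradient
    b = np.zeros(p)
    z = b.copy()
    t = 1.0
    for _ in range(n_iter):
        grad = X.T @ (X @ z - y)
        b_new = soft_threshold(z - step * grad, step * lam)
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        z = b_new + ((t - 1.0) / t_new) * (b_new - b)  # momentum step
        b, t = b_new, t_new
    return b


def relative_error(n, p, k, rho, lam=0.01, seed=0):
    """Relative l2 error of LASSO on one noiseless draw (assumed AR(1) design)."""
    rng = np.random.default_rng(seed)
    idx = np.arange(p)
    Sigma = rho ** np.abs(idx[:, None] - idx[None, :])  # assumed covariance
    chol = np.linalg.cholesky(Sigma)
    # Rows of X are N(0, Sigma / n); the 1/n scaling is an assumed normalization.
    X = rng.standard_normal((n, p)) @ chol.T / np.sqrt(n)
    beta0 = np.zeros(p)
    support = rng.choice(p, size=k, replace=False)
    beta0[support] = rng.choice([-1.0, 1.0], size=k)  # random sign pattern
    y = X @ beta0  # noiseless case, w = 0
    beta_hat = lasso_fista(X, y, lam)
    return np.linalg.norm(beta_hat - beta0) / np.linalg.norm(beta0)


if __name__ == "__main__":
    p, k, rho = 200, 10, 0.5  # epsilon = k / p = 0.05
    for n in (160, 20):       # delta = n / p = 0.8 and 0.1
        print(f"delta={n / p:.2f}  relative error={relative_error(n, p, k, rho):.3f}")
```

Because `lam` is positive, the estimate carries a small shrinkage bias even when recovery succeeds, so the error at the larger $\delta$ is small rather than exactly zero. Sweeping `n` and `k` over a grid and averaging over seeds would give an empirical picture of the transition, which could then be compared against a predicted $\delta_c$.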
