ResearchTrend.AI

arXiv:1508.02757

De-biasing the Lasso: Optimal Sample Size for Gaussian Designs

11 August 2015
Adel Javanmard
Andrea Montanari
Abstract

Performing statistical inference in high dimension is an outstanding challenge. A major source of difficulty is the absence of precise information on the distribution of high-dimensional estimators. Here, we consider linear regression in the high-dimensional regime p ≫ n. In this context, we would like to perform inference on a high-dimensional parameter vector θ* ∈ R^p. Important progress has been achieved in computing confidence intervals for single coordinates θ*_i. A key role in these new methods is played by a certain debiased estimator θ̂^d that is constructed from the Lasso. Earlier work establishes that, under suitable assumptions on the design matrix, the coordinates of θ̂^d are asymptotically Gaussian provided θ* is s_0-sparse with s_0 = o(√n / log p). The condition s_0 = o(√n / log p) is stronger than the one for consistent estimation, namely s_0 = o(n / log p). We study Gaussian designs with known or unknown population covariance. When the covariance is known, we prove that the debiased estimator is asymptotically Gaussian under the nearly optimal condition s_0 = o(n / (log p)^2). Note that earlier work was limited to s_0 = o(√n / log p) even for perfectly known covariance. The same conclusion holds if the population covariance is unknown but can be estimated sufficiently well, e.g. under the same sparsity conditions on the inverse covariance as assumed by earlier work. For intermediate regimes, we describe the trade-off between sparsity in the coefficients and in the inverse covariance of the design. We further discuss several applications of our results to high-dimensional inference. In particular, we propose a new estimator that is minimax optimal up to a factor 1 + o_n(1) for i.i.d. Gaussian designs.
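The debiased estimator described in the abstract has the form θ̂^d = θ̂^Lasso + (1/n) M X^T (y − X θ̂^Lasso), where M approximates the inverse population covariance. Below is a minimal numpy sketch of this construction in the simplest known-covariance case, Σ = I (so M = I). The problem dimensions, noise level, and the ISTA Lasso solver are illustrative choices for a runnable toy example, not the paper's setup or regime (in particular, p ≫ n does not hold here):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Gaussian design with identity population covariance.
# Dimensions are illustrative only; the paper studies p >> n.
n, p, s0, sigma = 200, 50, 3, 0.5
X = rng.standard_normal((n, p))
theta_star = np.zeros(p)
theta_star[:s0] = 2.0                      # s0-sparse true parameter
y = X @ theta_star + sigma * rng.standard_normal(n)

def lasso_ista(X, y, lam, n_iter=2000):
    """Solve min_theta (1/2n)||y - X theta||^2 + lam*||theta||_1 by ISTA."""
    n, p = X.shape
    L = np.linalg.norm(X, 2) ** 2 / n      # Lipschitz constant of the gradient
    theta = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ theta - y) / n
        z = theta - grad / L
        # Soft-thresholding (proximal step for the l1 penalty).
        theta = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)
    return theta

lam = 2 * sigma * np.sqrt(np.log(p) / n)   # standard Lasso regularization scale
theta_lasso = lasso_ista(X, y, lam)

# Debiasing step: theta_d = theta_lasso + (1/n) M X^T (y - X theta_lasso).
# With known population covariance Sigma = I, take M = Sigma^{-1} = I.
M = np.eye(p)
theta_d = theta_lasso + M @ X.T @ (y - X @ theta_lasso) / n
```

The one-step correction adds back an estimate of the shrinkage bias that the l1 penalty introduces on the active coordinates, which is what makes the coordinates of θ̂^d approximately Gaussian around θ* and usable for confidence intervals.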
