Optimal prediction for sparse linear models? Lower bounds for coordinate-separable M-estimators

11 March 2015
Yuchen Zhang
Martin J. Wainwright
Michael I. Jordan
Abstract

For the problem of high-dimensional sparse linear regression, it is known that an ℓ₀-based estimator can achieve a 1/n "fast" rate on the prediction error without any conditions on the design matrix, whereas in the absence of restrictive conditions on the design matrix, popular polynomial-time methods only guarantee the 1/√n "slow" rate. In this paper, we show that the slow rate is intrinsic to a broad class of M-estimators. In particular, for estimators based on minimizing a least-squares cost function together with a (possibly nonconvex) coordinate-wise separable regularizer, there is always a "bad" local optimum such that the associated prediction error is lower bounded by a constant multiple of 1/√n. For convex regularizers, this lower bound applies to all global optima. The theory applies to many popular estimators, including convex ℓ₁-based methods as well as M-estimators based on nonconvex regularizers such as the SCAD penalty and the MCP regularizer. In addition, for a broad class of nonconvex regularizers, we show that bad local optima are very common, in the sense that a broad class of local minimization algorithms with random initialization will typically converge to a bad solution.
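To make the estimator class concrete, the sketch below (not taken from the paper) sets up a least-squares cost plus a coordinate-wise separable nonconvex penalty, here MCP with a hypothetical γ = 3, and fits it by cyclic coordinate descent on synthetic Gaussian data. The data sizes, regularization level, and the choice of coordinate descent as the local-search algorithm are all illustrative assumptions; with a nonconvex penalty, the procedure only finds some local optimum, which is precisely the kind of solution whose prediction error the paper lower-bounds.

```python
# Illustrative sketch only: least squares + coordinate-separable MCP penalty,
# minimized by cyclic coordinate descent. All sizes and constants are hypothetical.
import numpy as np


def mcp_penalty(theta, lam, gamma=3.0):
    """Coordinate-separable MCP penalty, summed over coordinates."""
    a = np.abs(theta)
    return np.where(a <= gamma * lam,
                    lam * a - a ** 2 / (2 * gamma),
                    gamma * lam ** 2 / 2).sum()


def mcp_threshold(z, lam, gamma=3.0):
    """Scalar MCP update for a column with ||X_j||^2 / n = 1."""
    if abs(z) <= lam:
        return 0.0
    if abs(z) <= gamma * lam:
        return np.sign(z) * (abs(z) - lam) / (1.0 - 1.0 / gamma)
    return z


def coordinate_descent_mcp(X, y, lam, gamma=3.0, n_iter=200):
    """Local search for (1/2n)||y - X theta||^2 + sum_j P_mcp(theta_j).

    With a nonconvex regularizer this returns *a* local optimum,
    not necessarily a global one.
    """
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(n_iter):
        for j in range(d):
            # Partial residual with coordinate j removed.
            r = y - X @ theta + X[:, j] * theta[j]
            z = X[:, j] @ r / n
            theta[j] = mcp_threshold(z, lam, gamma)
    return theta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d, k = 200, 500, 5                       # hypothetical sample size, dimension, sparsity
    X = rng.standard_normal((n, d))
    X /= np.sqrt((X ** 2).mean(axis=0))         # normalize so ||X_j||^2 / n = 1
    theta_star = np.zeros(d)
    theta_star[:k] = 1.0
    y = X @ theta_star + 0.5 * rng.standard_normal(n)

    lam = 0.1
    theta_hat = coordinate_descent_mcp(X, y, lam)
    pred_err = np.mean((X @ (theta_hat - theta_star)) ** 2)
    obj = 0.5 * np.mean((y - X @ theta_hat) ** 2) + mcp_penalty(theta_hat, lam)
    print(f"in-sample prediction error ||X(theta_hat - theta*)||^2 / n = {pred_err:.4f}")
    print(f"penalized objective value = {obj:.4f}")
```

The printed quantity ||X(θ̂ − θ*)||² / n is the prediction error the abstract refers to; the paper's result concerns lower bounds on this quantity at bad local optima, not the behavior of this particular toy run.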
