Surprises in High-Dimensional Ridgeless Least Squares Interpolation

19 March 2019
Trevor Hastie
Andrea Montanari
Saharon Rosset
Robert Tibshirani
arXiv:1903.08560
Abstract

Interpolators---estimators that achieve zero training error---have attracted growing attention in machine learning, mainly because state-of-the-art neural networks appear to be models of this type. In this paper, we study minimum $\ell_2$ norm ("ridgeless") interpolation in high-dimensional least squares regression. We consider two different models for the feature distribution: a linear model, where the feature vectors $x_i \in \mathbb{R}^p$ are obtained by applying a linear transform to a vector of i.i.d. entries, $x_i = \Sigma^{1/2} z_i$ (with $z_i \in \mathbb{R}^p$); and a nonlinear model, where the feature vectors are obtained by passing the input through a random one-layer neural network, $x_i = \varphi(W z_i)$ (with $z_i \in \mathbb{R}^d$, $W \in \mathbb{R}^{p \times d}$ a matrix of i.i.d. entries, and $\varphi$ an activation function acting componentwise on $W z_i$). We recover---in a precise quantitative way---several phenomena that have been observed in large-scale neural networks and kernel machines, including the "double descent" behavior of the prediction risk and the potential benefits of overparametrization.
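The following is a minimal sketch, not the authors' code, of the setup the abstract describes: min-norm ("ridgeless") least squares interpolation under the linear feature model $x_i = \Sigma^{1/2} z_i$, with the test risk traced as the overparametrization ratio $\gamma = p/n$ varies. The sample size, noise level, choice of $\Sigma = I$, and signal scaling are illustrative assumptions, not values taken from the paper.

import numpy as np

rng = np.random.default_rng(0)

n = 200           # training samples (assumed)
sigma = 0.5       # noise standard deviation (assumed)
n_test = 2000     # test samples for estimating prediction risk

ratios = np.linspace(0.1, 3.0, 30)   # overparametrization ratio gamma = p / n
risks = []
for gamma in ratios:
    p = max(1, int(gamma * n))

    # Linear feature model with Sigma = I, so x_i = z_i with i.i.d. N(0,1) entries.
    beta = rng.standard_normal(p)
    beta /= np.linalg.norm(beta)      # fix signal strength ||beta|| = 1 (assumed)
    X = rng.standard_normal((n, p))
    y = X @ beta + sigma * rng.standard_normal(n)

    # Minimum-l2-norm least squares solution via the pseudoinverse; when p > n
    # this interpolates the training data exactly ("ridgeless" interpolation).
    beta_hat = np.linalg.pinv(X) @ y

    # Prediction risk on fresh features, measured against the noiseless signal.
    X_test = rng.standard_normal((n_test, p))
    risk = np.mean((X_test @ beta_hat - X_test @ beta) ** 2)
    risks.append(risk)

for gamma, r in zip(ratios, risks):
    print(f"gamma = p/n = {gamma:4.2f}   test risk = {r:7.3f}")

# The risk typically spikes near gamma = 1 and decreases again for gamma > 1,
# the "double descent" shape discussed in the abstract.

Plotting the printed risks against gamma reproduces the qualitative double-descent curve; the paper's results characterize this limit precisely for both the linear and the nonlinear (random-features) models.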
