From Tempered to Benign Overfitting in ReLU Neural Networks

24 May 2023

Papers citing "From Tempered to Benign Overfitting in ReLU Neural Networks"

45 / 45 papers shown

Quantifying Overfitting along the Regularization Path for Two-Part-Code MDL in Supervised Classification Xiaohan Zhu Nathan Srebro 92 0 0 03 Mar 2025
Noisy Interpolation Learning with Shallow Univariate ReLU Networks Nirmit Joshi Gal Vardi Nathan Srebro 95 8 0 28 Jul 2023
Benign Overfitting in Linear Classifiers and Leaky ReLU Networks from KKT Conditions for Margin Maximization Spencer Frei Gal Vardi Peter L. Bartlett Nathan Srebro 75 23 0 02 Mar 2023
Penalising the biases in norm regularisation enforces sparsity Etienne Boursier Nicolas Flammarion 115 17 0 02 Mar 2023
Interpolation Learning With Minimum Description Length N. Manoj Nathan Srebro 49 4 0 14 Feb 2023
Deep Linear Networks can Benignly Overfit when Shallow Ones Do Niladri S. Chatterji Philip M. Long 72 8 0 19 Sep 2022
On the Implicit Bias in Deep-Learning Algorithms Gal Vardi FedML AI4CE 71 81 0 26 Aug 2022
Max-Margin Works while Large Margin Fails: Generalization without Uniform Convergence Margalit Glasgow Colin Wei Mary Wootters Tengyu Ma 90 5 0 16 Jun 2022
On the Inconsistency of Kernel Ridgeless Regression in Fixed Dimensions Daniel Beaglehole M. Belkin Parthe Pandit 60 11 0 26 May 2022
On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias Itay Safran Gal Vardi Jason D. Lee MLT 95 24 0 18 May 2022
Benign Overfitting in Two-layer Convolutional Neural Networks Yuan Cao Zixiang Chen M. Belkin Quanquan Gu MLT 82 89 0 14 Feb 2022
Benign Overfitting without Linearity: Neural Network Classifiers Trained by Gradient Descent for Noisy Linear Data Spencer Frei Niladri S. Chatterji Peter L. Bartlett MLT 107 75 0 11 Feb 2022
Optimistic Rates: A Unifying Theory for Interpolation Learning and Regularization in Linear Regression Lijia Zhou Frederic Koehler Danica J. Sutherland Nathan Srebro 131 25 0 08 Dec 2021
Understanding Square Loss in Training Overparametrized Neural Network Classifiers Tianyang Hu Jun Wang Wei Cao Zhenguo Li UQCV AAML 84 19 0 07 Dec 2021
Harmless interpolation in regression and classification with structured features Andrew D. McRae Santhosh Karnik Mark A. Davenport Vidya Muthukumar 175 11 0 09 Nov 2021
On Margin Maximization in Linear and ReLU Networks Gal Vardi Ohad Shamir Nathan Srebro 127 30 0 06 Oct 2021
Benign Overfitting in Multiclass Classification: All Roads Lead to Interpolation Ke Wang Vidya Muthukumar Christos Thrampoulidis 75 49 0 21 Jun 2021
Uniform Convergence of Interpolators: Gaussian Width, Norm Bounds, and Benign Overfitting Frederic Koehler Lijia Zhou Danica J. Sutherland Nathan Srebro 76 57 0 17 Jun 2021
Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation M. Belkin 53 186 0 29 May 2021
Uniform Convergence, Adversarial Spheres and a Simple Remedy Gregor Bachmann Seyed-Mohsen Moosavi-Dezfooli Thomas Hofmann AAML 38 8 0 07 May 2021
Risk Bounds for Over-parameterized Maximum Margin Classification on Sub-Gaussian Mixtures Yuan Cao Quanquan Gu M. Belkin 63 53 0 28 Apr 2021
Deep learning: a statistical viewpoint Peter L. Bartlett Andrea Montanari Alexander Rakhlin 70 279 0 16 Mar 2021
Exact Gap between Generalization Error and Uniform Convergence in Random Feature Models Zitong Yang Yu Bai Song Mei 60 18 0 08 Mar 2021
Interpolating Classifiers Make Few Mistakes Tengyuan Liang Benjamin Recht 55 28 0 28 Jan 2021
Theoretical Insights Into Multiclass Classification: A High-dimensional Asymptotic View Christos Thrampoulidis Samet Oymak Mahdi Soltanolkotabi 69 43 0 16 Nov 2020
Failures of model-dependent generalization bounds for least-norm interpolation Peter L. Bartlett Philip M. Long 148 29 0 16 Oct 2020
Distributional Generalization: A New Kind of Generalization Preetum Nakkiran Yamini Bansal OOD 70 42 0 17 Sep 2020
Directional convergence and alignment in deep learning Ziwei Ji Matus Telgarsky 66 171 0 11 Jun 2020
Classification vs regression in overparameterized regimes: Does the loss function matter? Vidya Muthukumar Adhyyan Narang Vignesh Subramanian M. Belkin Daniel J. Hsu A. Sahai 103 151 0 16 May 2020
Finite-sample Analysis of Interpolating Linear Classifiers in the Overparameterized Regime Niladri S. Chatterji Philip M. Long 83 109 0 25 Apr 2020
In Defense of Uniform Convergence: Generalization via derandomization with an application to interpolating predictors Jeffrey Negrea Gintare Karolina Dziugaite Daniel M. Roy AI4CE 74 65 0 09 Dec 2019
The generalization error of random features regression: Precise asymptotics and double descent curve Song Mei Andrea Montanari 103 639 0 14 Aug 2019
Benign Overfitting in Linear Regression Peter L. Bartlett Philip M. Long Gábor Lugosi Alexander Tsigler MLT 105 779 0 26 Jun 2019
Gradient Descent Maximizes the Margin of Homogeneous Neural Networks Kaifeng Lyu Jian Li 98 336 0 13 Jun 2019
Surprises in High-Dimensional Ridgeless Least Squares Interpolation Trevor Hastie Andrea Montanari Saharon Rosset Robert Tibshirani 222 747 0 19 Mar 2019
Two models of double descent for weak features M. Belkin Daniel J. Hsu Ji Xu 117 375 0 18 Mar 2019
How do infinite width bounded norm networks look in function space? Pedro H. P. Savarese Itay Evron Daniel Soudry Nathan Srebro 85 166 0 13 Feb 2019
Uniform convergence may be unable to explain generalization in deep learning Vaishnavh Nagarajan J. Zico Kolter MoMe AI4CE 86 317 0 13 Feb 2019
Consistency of Interpolation with Laplace Kernels is a High-Dimensional Phenomenon Alexander Rakhlin Xiyu Zhai 112 79 0 28 Dec 2018
Reconciling modern machine learning practice and the bias-variance trade-off M. Belkin Daniel J. Hsu Siyuan Ma Soumik Mandal 247 1,659 0 28 Dec 2018
Just Interpolate: Kernel "Ridgeless" Regression Can Generalize Tengyuan Liang Alexander Rakhlin 89 355 0 01 Aug 2018
Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate M. Belkin Daniel J. Hsu P. Mitra AI4CE 153 259 0 13 Jun 2018
To understand deep learning we need to understand kernel learning M. Belkin Siyuan Ma Soumik Mandal 75 420 0 05 Feb 2018
The Implicit Bias of Gradient Descent on Separable Data Daniel Soudry Elad Hoffer Mor Shpigel Nacson Suriya Gunasekar Nathan Srebro 174 924 0 27 Oct 2017
Understanding deep learning requires rethinking generalization Chiyuan Zhang Samy Bengio Moritz Hardt Benjamin Recht Oriol Vinyals HAI 351 4,636 0 10 Nov 2016