2305.15141
From Tempered to Benign Overfitting in ReLU Neural Networks
24 May 2023
Guy Kornowski
Gilad Yehudai
Ohad Shamir
Papers citing
"From Tempered to Benign Overfitting in ReLU Neural Networks"
45 / 45 papers shown
Title
Quantifying Overfitting along the Regularization Path for Two-Part-Code MDL in Supervised Classification
Xiaohan Zhu
Nathan Srebro
92
0
0
03 Mar 2025
Noisy Interpolation Learning with Shallow Univariate ReLU Networks
Nirmit Joshi
Gal Vardi
Nathan Srebro
95
8
0
28 Jul 2023
Benign Overfitting in Linear Classifiers and Leaky ReLU Networks from KKT Conditions for Margin Maximization
Spencer Frei
Gal Vardi
Peter L. Bartlett
Nathan Srebro
75
23
0
02 Mar 2023
Penalising the biases in norm regularisation enforces sparsity
Etienne Boursier
Nicolas Flammarion
115
17
0
02 Mar 2023
Interpolation Learning With Minimum Description Length
N. Manoj
Nathan Srebro
49
4
0
14 Feb 2023
Deep Linear Networks can Benignly Overfit when Shallow Ones Do
Niladri S. Chatterji
Philip M. Long
72
8
0
19 Sep 2022
On the Implicit Bias in Deep-Learning Algorithms
Gal Vardi
FedML
AI4CE
71
81
0
26 Aug 2022
Max-Margin Works while Large Margin Fails: Generalization without Uniform Convergence
Margalit Glasgow
Colin Wei
Mary Wootters
Tengyu Ma
90
5
0
16 Jun 2022
On the Inconsistency of Kernel Ridgeless Regression in Fixed Dimensions
Daniel Beaglehole
M. Belkin
Parthe Pandit
60
11
0
26 May 2022
On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias
Itay Safran
Gal Vardi
Jason D. Lee
MLT
95
24
0
18 May 2022
Benign Overfitting in Two-layer Convolutional Neural Networks
Yuan Cao
Zixiang Chen
M. Belkin
Quanquan Gu
MLT
82
89
0
14 Feb 2022
Benign Overfitting without Linearity: Neural Network Classifiers Trained by Gradient Descent for Noisy Linear Data
Spencer Frei
Niladri S. Chatterji
Peter L. Bartlett
MLT
107
75
0
11 Feb 2022
Optimistic Rates: A Unifying Theory for Interpolation Learning and Regularization in Linear Regression
Lijia Zhou
Frederic Koehler
Danica J. Sutherland
Nathan Srebro
131
25
0
08 Dec 2021
Understanding Square Loss in Training Overparametrized Neural Network Classifiers
Tianyang Hu
Jun Wang
Wei Cao
Zhenguo Li
UQCV
AAML
84
19
0
07 Dec 2021
Harmless interpolation in regression and classification with structured features
Andrew D. McRae
Santhosh Karnik
Mark A. Davenport
Vidya Muthukumar
175
11
0
09 Nov 2021
On Margin Maximization in Linear and ReLU Networks
Gal Vardi
Ohad Shamir
Nathan Srebro
127
30
0
06 Oct 2021
Benign Overfitting in Multiclass Classification: All Roads Lead to Interpolation
Ke Wang
Vidya Muthukumar
Christos Thrampoulidis
75
49
0
21 Jun 2021
Uniform Convergence of Interpolators: Gaussian Width, Norm Bounds, and Benign Overfitting
Frederic Koehler
Lijia Zhou
Danica J. Sutherland
Nathan Srebro
76
57
0
17 Jun 2021
Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation
M. Belkin
53
186
0
29 May 2021
Uniform Convergence, Adversarial Spheres and a Simple Remedy
Gregor Bachmann
Seyed-Mohsen Moosavi-Dezfooli
Thomas Hofmann
AAML
38
8
0
07 May 2021
Risk Bounds for Over-parameterized Maximum Margin Classification on Sub-Gaussian Mixtures
Yuan Cao
Quanquan Gu
M. Belkin
63
53
0
28 Apr 2021
Deep learning: a statistical viewpoint
Peter L. Bartlett
Andrea Montanari
Alexander Rakhlin
70
279
0
16 Mar 2021
Exact Gap between Generalization Error and Uniform Convergence in Random Feature Models
Zitong Yang
Yu Bai
Song Mei
60
18
0
08 Mar 2021
Interpolating Classifiers Make Few Mistakes
Tengyuan Liang
Benjamin Recht
55
28
0
28 Jan 2021
Theoretical Insights Into Multiclass Classification: A High-dimensional Asymptotic View
Christos Thrampoulidis
Samet Oymak
Mahdi Soltanolkotabi
69
43
0
16 Nov 2020
Failures of model-dependent generalization bounds for least-norm interpolation
Peter L. Bartlett
Philip M. Long
148
29
0
16 Oct 2020
Distributional Generalization: A New Kind of Generalization
Preetum Nakkiran
Yamini Bansal
OOD
70
42
0
17 Sep 2020
Directional convergence and alignment in deep learning
Ziwei Ji
Matus Telgarsky
66
171
0
11 Jun 2020
Classification vs regression in overparameterized regimes: Does the loss function matter?
Vidya Muthukumar
Adhyyan Narang
Vignesh Subramanian
M. Belkin
Daniel J. Hsu
A. Sahai
103
151
0
16 May 2020
Finite-sample Analysis of Interpolating Linear Classifiers in the Overparameterized Regime
Niladri S. Chatterji
Philip M. Long
83
109
0
25 Apr 2020
In Defense of Uniform Convergence: Generalization via derandomization with an application to interpolating predictors
Jeffrey Negrea
Gintare Karolina Dziugaite
Daniel M. Roy
AI4CE
74
65
0
09 Dec 2019
The generalization error of random features regression: Precise asymptotics and double descent curve
Song Mei
Andrea Montanari
103
639
0
14 Aug 2019
Benign Overfitting in Linear Regression
Peter L. Bartlett
Philip M. Long
Gábor Lugosi
Alexander Tsigler
MLT
105
779
0
26 Jun 2019
Gradient Descent Maximizes the Margin of Homogeneous Neural Networks
Kaifeng Lyu
Jian Li
98
336
0
13 Jun 2019
Surprises in High-Dimensional Ridgeless Least Squares Interpolation
Trevor Hastie
Andrea Montanari
Saharon Rosset
Robert Tibshirani
222
747
0
19 Mar 2019
Two models of double descent for weak features
M. Belkin
Daniel J. Hsu
Ji Xu
117
375
0
18 Mar 2019
How do infinite width bounded norm networks look in function space?
Pedro H. P. Savarese
Itay Evron
Daniel Soudry
Nathan Srebro
85
166
0
13 Feb 2019
Uniform convergence may be unable to explain generalization in deep learning
Vaishnavh Nagarajan
J. Zico Kolter
MoMe
AI4CE
86
317
0
13 Feb 2019
Consistency of Interpolation with Laplace Kernels is a High-Dimensional Phenomenon
Alexander Rakhlin
Xiyu Zhai
112
79
0
28 Dec 2018
Reconciling modern machine learning practice and the bias-variance trade-off
M. Belkin
Daniel J. Hsu
Siyuan Ma
Soumik Mandal
247
1,659
0
28 Dec 2018
Just Interpolate: Kernel "Ridgeless" Regression Can Generalize
Tengyuan Liang
Alexander Rakhlin
89
355
0
01 Aug 2018
Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate
M. Belkin
Daniel J. Hsu
P. Mitra
AI4CE
153
259
0
13 Jun 2018
To understand deep learning we need to understand kernel learning
M. Belkin
Siyuan Ma
Soumik Mandal
75
420
0
05 Feb 2018
The Implicit Bias of Gradient Descent on Separable Data
Daniel Soudry
Elad Hoffer
Mor Shpigel Nacson
Suriya Gunasekar
Nathan Srebro
174
924
0
27 Oct 2017
Understanding deep learning requires rethinking generalization
Chiyuan Zhang
Samy Bengio
Moritz Hardt
Benjamin Recht
Oriol Vinyals
HAI
351
4,636
0
10 Nov 2016