Deep Double Descent: Where Bigger Models and More Data Hurt
Preetum Nakkiran, Gal Kaplun, Yamini Bansal, Tristan Yang, Boaz Barak, Ilya Sutskever
arXiv:1912.02292 · 4 December 2019
Papers citing "Deep Double Descent: Where Bigger Models and More Data Hurt" (50 of 205 shown)
Sparse-IFT: Sparse Iso-FLOP Transformations for Maximizing Training Efficiency
Vithursan Thangarasa, Shreyas Saxena, Abhay Gupta, Sean Lie · 21 Mar 2023

Memorization Capacity of Neural Networks with Conditional Computation
Erdem Koyuncu · 20 Mar 2023

Deep Learning Weight Pruning with RMT-SVD: Increasing Accuracy and Reducing Overfitting
Yitzchak Shmalo, Jonathan Jenkins, Oleksii Krupchytskyi · 15 Mar 2023

Unifying Grokking and Double Descent
Peter W. Battaglia, David Raposo, Kelsey · 10 Mar 2023

Tradeoff of generalization error in unsupervised learning
Gilhan Kim, Ho-Jun Lee, Junghyo Jo, Yongjoo Baek · 10 Mar 2023

DSD²: Can We Dodge Sparse Double Descent and Compress the Neural Network Worry-Free?
Victor Quétu, Enzo Tartaglione · 02 Mar 2023

Can we avoid Double Descent in Deep Neural Networks?
Victor Quétu, Enzo Tartaglione · AI4CE · 26 Feb 2023

Generalization Bounds with Data-dependent Fractal Dimensions
Benjamin Dupuis, George Deligiannidis, Umut Şimşekli · AI4CE · 06 Feb 2023

Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels
Simone Bombari, Shayan Kiyani, Marco Mondelli · AAML · 03 Feb 2023

Sharp Lower Bounds on Interpolation by Deep ReLU Neural Networks at Irregularly Spaced Data
Jonathan W. Siegel · 02 Feb 2023

Pathologies of Predictive Diversity in Deep Ensembles
Taiga Abe, E. Kelly Buchanan, Geoff Pleiss, John P. Cunningham · UQCV · 01 Feb 2023

On the Lipschitz Constant of Deep Networks and Double Descent
Matteo Gamba, Hossein Azizpour, Mårten Björkman · 28 Jan 2023

A Simple Algorithm For Scaling Up Kernel Methods
Tengyu Xu, Bryan Kelly, Semyon Malamud · 26 Jan 2023

A Stability Analysis of Fine-Tuning a Pre-Trained Model
Z. Fu, Anthony Man-Cho So, Nigel Collier · 24 Jan 2023

Strong inductive biases provably prevent harmless interpolation
Michael Aerni, Marco Milanta, Konstantin Donhauser, Fanny Yang · 18 Jan 2023

WLD-Reg: A Data-dependent Within-layer Diversity Regularizer
Firas Laakom, Jenni Raitoharju, Alexandros Iosifidis, Moncef Gabbouj · AI4CE · 03 Jan 2023

Problem-Dependent Power of Quantum Neural Networks on Multi-Class Classification
Yuxuan Du, Yibo Yang, Dacheng Tao, Min-hsiu Hsieh · 29 Dec 2022

Gradient flow in the gaussian covariate model: exact solution of learning curves and multiple descent structures
Antoine Bodin, N. Macris · 13 Dec 2022

Reliable extrapolation of deep neural operators informed by physics or sparse observations
Min Zhu, Handi Zhang, Anran Jiao, George Karniadakis, Lu Lu · 13 Dec 2022

Leveraging Unlabeled Data to Track Memorization
Mahsa Forouzesh, Hanie Sedghi, Patrick Thiran · NoLa, TDI · 08 Dec 2022

Task Discovery: Finding the Tasks that Neural Networks Generalize on
Andrei Atanov, Andrei Filatov, Teresa Yeo, Ajay Sohmshetty, Amir Zamir · OOD · 01 Dec 2022

A Survey of Learning Curves with Bad Behavior: or How More Data Need Not Lead to Better Performance
Marco Loog, T. Viering · 25 Nov 2022

PAC-Bayes Compression Bounds So Tight That They Can Explain Generalization
Sanae Lotfi, Marc Finzi, Sanyam Kapoor, Andres Potapczynski, Micah Goldblum, A. Wilson · BDL, MLT, AI4CE · 24 Nov 2022

Novel transfer learning schemes based on Siamese networks and synthetic data
Dominik Stallmann, Philip Kenneweg, Barbara Hammer · 21 Nov 2022

Understanding the double descent curve in Machine Learning
Luis Sa-Couto, J. M. Ramos, Miguel Almeida, Andreas Wichert · 18 Nov 2022

Regression as Classification: Influence of Task Formulation on Neural Network Features
Lawrence Stewart, Francis R. Bach, Quentin Berthet, Jean-Philippe Vert · 10 Nov 2022

A Solvable Model of Neural Scaling Laws
A. Maloney, Daniel A. Roberts, J. Sully · 30 Oct 2022

Broken Neural Scaling Laws
Ethan Caballero, Kshitij Gupta, Irina Rish, David M. Krueger · 26 Oct 2022

Grokking phase transitions in learning local rules with gradient descent
Bojan Žunkovič, E. Ilievski · 26 Oct 2022

SGD with Large Step Sizes Learns Sparse Features
Maksym Andriushchenko, Aditya Varre, Loucas Pillaud-Vivien, Nicolas Flammarion · 11 Oct 2022

The Dynamic of Consensus in Deep Networks and the Identification of Noisy Labels
Daniel Shwartz, Uri Stern, D. Weinshall · NoLa · 02 Oct 2022

The Minority Matters: A Diversity-Promoting Collaborative Metric Learning Algorithm
Shilong Bao, Qianqian Xu, Zhiyong Yang, Yuan He, Xiaochun Cao, Qingming Huang · 30 Sep 2022

On the Impossible Safety of Large AI Models
El-Mahdi El-Mhamdi, Sadegh Farhadkhani, R. Guerraoui, Nirupam Gupta, L. Hoang, Rafael Pinot, Sébastien Rouault, John Stephan · 30 Sep 2022

Understanding Collapse in Non-Contrastive Siamese Representation Learning
Alexander C. Li, Alexei A. Efros, Deepak Pathak · SSL · 29 Sep 2022

Bayesian Neural Network Versus Ex-Post Calibration For Prediction Uncertainty
Satya Borgohain, Klaus Ackermann, Rubén Loaiza-Maya · BDL, UQCV · 29 Sep 2022

Neural parameter calibration for large-scale multi-agent models
Thomas Gaskin, G. Pavliotis, Mark Girolami · AI4TS · 27 Sep 2022

Why neural networks find simple solutions: the many regularizers of geometric complexity
Benoit Dherin, Michael Munn, M. Rosca, David Barrett · 27 Sep 2022

In-context Learning and Induction Heads
Catherine Olsson, Nelson Elhage, Neel Nanda, Nicholas Joseph, Nova Dassarma, ..., Tom B. Brown, Jack Clark, Jared Kaplan, Sam McCandlish, C. Olah · 24 Sep 2022

Deep Double Descent via Smooth Interpolation
Matteo Gamba, Erik Englesson, Mårten Björkman, Hossein Azizpour · 21 Sep 2022

Importance Tempering: Group Robustness for Overparameterized Models
Yiping Lu, Wenlong Ji, Zachary Izzo, Lexing Ying · 19 Sep 2022

Information FOMO: The unhealthy fear of missing out on information. A method for removing misleading data for healthier models
Ethan Pickering, T. Sapsis · 27 Aug 2022

Learning Hyper Label Model for Programmatic Weak Supervision
Renzhi Wu, Sheng Chen, Jieyu Zhang, Xu Chu · 27 Jul 2022

The BUTTER Zone: An Empirical Study of Training Dynamics in Fully Connected Neural Networks
Charles Edison Tripp, J. Perr-Sauer, L. Hayne, M. Lunacek, Jamil Gafur · AI4CE · 25 Jul 2022

Sparse Double Descent: Where Network Pruning Aggravates Overfitting
Zhengqi He, Zeke Xie, Quanzhi Zhu, Zengchang Qin · 17 Jun 2022

Learning Uncertainty with Artificial Neural Networks for Improved Predictive Process Monitoring
Hans Weytjens, Jochen De Weerdt · 13 Jun 2022

Towards Understanding Sharpness-Aware Minimization
Maksym Andriushchenko, Nicolas Flammarion · AAML · 13 Jun 2022

Overcoming the Spectral Bias of Neural Value Approximation
Ge Yang, Anurag Ajay, Pulkit Agrawal · 09 Jun 2022

FIFA: Making Fairness More Generalizable in Classifiers Trained on Imbalanced Data
Zhun Deng, Jiayao Zhang, Linjun Zhang, Ting Ye, Yates Coley, Weijie J. Su, James Zou · 06 Jun 2022

Regularization-wise double descent: Why it occurs and how to eliminate it
Fatih Yilmaz, Reinhard Heckel · 03 Jun 2022

A Blessing of Dimensionality in Membership Inference through Regularization
Jasper Tan, Daniel LeJeune, Blake Mason, Hamid Javadi, Richard G. Baraniuk · 27 May 2022