Deep Double Descent: Where Bigger Models and More Data Hurt
4 December 2019 · arXiv:1912.02292
Preetum Nakkiran, Gal Kaplun, Yamini Bansal, Tristan Yang, Boaz Barak, Ilya Sutskever

Papers citing "Deep Double Descent: Where Bigger Models and More Data Hurt"

50 / 205 papers shown

Physics-Embedded Neural Networks: Graph Neural PDE Solvers with Mixed Boundary Conditions
Masanobu Horie, Naoto Mitsume
Communities: PINN, AI4CE
24 May 2022

Memorization Without Overfitting: Analyzing the Training Dynamics of Large Language Models
Kushal Tirumala, Aram H. Markosyan, Luke Zettlemoyer, Armen Aghajanyan
Communities: TDI
22 May 2022

Sharp Asymptotics of Kernel Ridge Regression Beyond the Linear Regime
Hong Hu, Yue M. Lu
13 May 2022

Overparameterization Improves StyleGAN Inversion
Yohan Poirier-Ginter, Alexandre Lessard, Ryan Smith, Jean-François Lalonde
12 May 2022

Investigating Generalization by Controlling Normalized Margin
Alexander R. Farhang, Jeremy Bernstein, Kushal Tirumala, Yang Liu, Yisong Yue
08 May 2022

Revisiting the Adversarial Robustness-Accuracy Tradeoff in Robot Learning
Mathias Lechner, Alexander Amini, Daniela Rus, T. Henzinger
Communities: AAML
15 Apr 2022

Machine Learning and Deep Learning -- A review for Ecologists
Maximilian Pichler, F. Hartig
11 Apr 2022

Discovering and forecasting extreme events via active learning in neural operators
Ethan Pickering, Stephen Guth, George Karniadakis, T. Sapsis
Communities: AI4CE
05 Apr 2022

Evolving Neural Selection with Adaptive Regularization
Li Ding, Lee Spector
Communities: ODL
04 Apr 2022

Random matrix analysis of deep neural network weight matrices
M. Thamm, Max Staats, B. Rosenow
28 Mar 2022

VoViT: Low Latency Graph-based Audio-Visual Voice Separation Transformer
Juan F. Montesinos, V. S. Kadandale, G. Haro
Communities: ViT
08 Mar 2022

On the Optimization Landscape of Neural Collapse under MSE Loss: Global Optimality with Unconstrained Features
Jinxin Zhou, Xiao Li, Tian Ding, Chong You, Qing Qu, Zhihui Zhu
02 Mar 2022

Contrasting random and learned features in deep Bayesian linear regression
Jacob A. Zavatone-Veth, William L. Tong, Cengiz Pehlevan
Communities: BDL, MLT
01 Mar 2022

Deconstructing Distributions: A Pointwise Framework of Learning
Gal Kaplun, Nikhil Ghosh, Saurabh Garg, Boaz Barak, Preetum Nakkiran
Communities: OOD
20 Feb 2022

Overparametrization improves robustness against adversarial attacks: A replication study
Ali Borji
Communities: AAML
20 Feb 2022

Deep Ensembles Work, But Are They Necessary?
Taiga Abe, E. Kelly Buchanan, Geoff Pleiss, R. Zemel, John P. Cunningham
Communities: OOD, UQCV
14 Feb 2022

Understanding Rare Spurious Correlations in Neural Networks
Yao-Yuan Yang, Chi-Ning Chou, Kamalika Chaudhuri
Communities: AAML
10 Feb 2022

Evaluating natural language processing models with generalization metrics that do not need access to any training or testing data
Yaoqing Yang, Ryan Theisen, Liam Hodgkinson, Joseph E. Gonzalez, Kannan Ramchandran, Charles H. Martin, Michael W. Mahoney
06 Feb 2022

Fast Convex Optimization for Two-Layer ReLU Networks: Equivalent Model Classes and Cone Decompositions
Aaron Mishkin, Arda Sahiner, Mert Pilanci
Communities: OffRL
02 Feb 2022

Learning Curves for Decision Making in Supervised Machine Learning: A Survey
F. Mohr, Jan N. van Rijn
28 Jan 2022

On the Robustness of Sparse Counterfactual Explanations to Adverse Perturbations
M. Virgolin, Saverio Fracaros
Communities: CML
22 Jan 2022

Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape
Devansh Bisla, Jing Wang, A. Choromańska
20 Jan 2022

A Kernel-Expanded Stochastic Neural Network
Y. Sun, F. Liang
14 Jan 2022

The Effect of Model Size on Worst-Group Generalization
Alan Pham, Eunice Chan, V. Srivatsa, Dhruba Ghosh, Yaoqing Yang, Yaodong Yu, Ruiqi Zhong, Joseph E. Gonzalez, Jacob Steinhardt
08 Dec 2021

A generalization gap estimation for overparameterized models via the Langevin functional variance
Akifumi Okuno, Keisuke Yano
07 Dec 2021

Multi-scale Feature Learning Dynamics: Insights for Double Descent
Mohammad Pezeshki, Amartya Mitra, Yoshua Bengio, Guillaume Lajoie
06 Dec 2021

Learning Curves for Continual Learning in Neural Networks: Self-Knowledge Transfer and Forgetting
Ryo Karakida, S. Akaho
Communities: CLL
03 Dec 2021

Pixelated Butterfly: Simple and Efficient Sparse training for Neural Network Models
Tri Dao, Beidi Chen, Kaizhao Liang, Jiaming Yang, Zhao Song, Atri Rudra, Christopher Ré
30 Nov 2021

The Geometric Occam's Razor Implicit in Deep Learning
Benoit Dherin, Michael Munn, David Barrett
30 Nov 2021

Information-Theoretic Bayes Risk Lower Bounds for Realizable Models
M. Nokleby, Ahmad Beirami
08 Nov 2021

Mean-field Analysis of Piecewise Linear Solutions for Wide ReLU Networks
A. Shevchenko, Vyacheslav Kungurtsev, Marco Mondelli
Communities: MLT
03 Nov 2021

Whole Brain Segmentation with Full Volume Neural Network
Yeshu Li, Jianwei Cui, Yilun Sheng, Xiao Liang, Jingdong Wang, E. Chang, Yan Xu
29 Oct 2021

Model, sample, and epoch-wise descents: exact solution of gradient flow in the random feature model
A. Bodin, N. Macris
22 Oct 2021

Behavioral Experiments for Understanding Catastrophic Forgetting
Samuel J. Bell, Neil D. Lawrence
20 Oct 2021

The Role of Permutation Invariance in Linear Mode Connectivity of Neural Networks
R. Entezari, Hanie Sedghi, O. Saukh, Behnam Neyshabur
Communities: MoMe
12 Oct 2021

Label Noise in Adversarial Training: A Novel Perspective to Study Robust Overfitting
Chengyu Dong, Liyuan Liu, Jingbo Shang
Communities: NoLa, AAML
07 Oct 2021

On the Generalization of Models Trained with SGD: Information-Theoretic Bounds and Implications
Ziqiao Wang, Yongyi Mao
Communities: FedML, MLT
07 Oct 2021

Learning through atypical "phase transitions" in overparameterized neural networks
Carlo Baldassi, Clarissa Lauditi, Enrico M. Malatesta, R. Pacelli, Gabriele Perugini, R. Zecchina
01 Oct 2021

Powerpropagation: A sparsity inducing weight reparameterisation
Jonathan Richard Schwarz, Siddhant M. Jayakumar, Razvan Pascanu, P. Latham, Yee Whye Teh
01 Oct 2021

Is the Number of Trainable Parameters All That Actually Matters?
A. Chatelain, Amine Djeghri, Daniel Hesslow, Julien Launay, Iacopo Poli
24 Sep 2021

Neural forecasting at scale
Philippe Chatigny, Shengrui Wang, Jean-Marc Patenaude, Boris N. Oreshkin
Communities: AI4TS
20 Sep 2021

A Farewell to the Bias-Variance Tradeoff? An Overview of the Theory of Overparameterized Machine Learning
Yehuda Dar, Vidya Muthukumar, Richard G. Baraniuk
06 Sep 2021

The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers
Róbert Csordás, Kazuki Irie, Jürgen Schmidhuber
Communities: ViT
26 Aug 2021

Interpolation can hurt robust generalization even when there is no noise
Konstantin Donhauser, Alexandru Țifrea, Michael Aerni, Reinhard Heckel, Fanny Yang
05 Aug 2021

Simple, Fast, and Flexible Framework for Matrix Completion with Infinite Width Neural Networks
Adityanarayanan Radhakrishnan, George Stefanakis, M. Belkin, Caroline Uhler
31 Jul 2021

On the Efficacy of Small Self-Supervised Contrastive Models without Distillation Signals
Haizhou Shi, Youcai Zhang, Siliang Tang, Wenjie Zhu, Yaqian Li, Yandong Guo, Yueting Zhuang
Communities: SyDa
30 Jul 2021

A Theory of PAC Learnability of Partial Concept Classes
N. Alon, Steve Hanneke, R. Holzman, Shay Moran
18 Jul 2021

18 Jul 2021
A Mechanism for Producing Aligned Latent Spaces with Autoencoders
A Mechanism for Producing Aligned Latent Spaces with Autoencoders
Saachi Jain
Adityanarayanan Radhakrishnan
Caroline Uhler
21
9
0
29 Jun 2021
Jitter: Random Jittering Loss Function
Jitter: Random Jittering Loss Function
Zhicheng Cai
Chenglei Peng
S. Du
21
3
0
25 Jun 2021
The Limitations of Large Width in Neural Networks: A Deep Gaussian
  Process Perspective
The Limitations of Large Width in Neural Networks: A Deep Gaussian Process Perspective
Geoff Pleiss
John P. Cunningham
28
24
0
11 Jun 2021