ResearchTrend.AI

Scaling description of generalization with number of parameters in deep learning
6 January 2019
Mario Geiger
Arthur Jacot
S. Spigler
Franck Gabriel
Levent Sagun
Stéphane d'Ascoli
Giulio Biroli
Clément Hongler
M. Wyart

Papers citing "Scaling description of generalization with number of parameters in deep learning"

40 / 40 papers shown
The Double Descent Behavior in Two Layer Neural Network for Binary Classification
Chathurika S Abeykoon
A. Beknazaryan
Hailin Sang
53
1
0
27 Apr 2025
How DNNs break the Curse of Dimensionality: Compositionality and Symmetry Learning
Arthur Jacot
Seok Hoan Choi
Yuxiao Wen
AI4CE
91
2
0
08 Jul 2024
When are ensembles really effective?
Ryan Theisen
Hyunsuk Kim
Yaoqing Yang
Liam Hodgkinson
Michael W. Mahoney
FedML
UQCV
35
15
0
21 May 2023
Gradient flow in the gaussian covariate model: exact solution of learning curves and multiple descent structures
Antoine Bodin
N. Macris
34
4
0
13 Dec 2022
Continual task learning in natural and artificial agents
Timo Flesch
Andrew M. Saxe
Christopher Summerfield
CLL
43
24
0
10 Oct 2022
Approximation results for Gradient Descent trained Shallow Neural Networks in $1d$
R. Gentile
G. Welper
ODL
52
6
0
17 Sep 2022
Learning sparse features can lead to overfitting in neural networks
Leonardo Petrini
Francesco Cagnetta
Eric Vanden-Eijnden
M. Wyart
MLT
42
23
0
24 Jun 2022
Contrasting random and learned features in deep Bayesian linear regression
Jacob A. Zavatone-Veth
William L. Tong
Cengiz Pehlevan
BDL
MLT
28
26
0
01 Mar 2022
A generalization gap estimation for overparameterized models via the Langevin functional variance
Akifumi Okuno
Keisuke Yano
38
1
0
07 Dec 2021
Multi-scale Feature Learning Dynamics: Insights for Double Descent
Mohammad Pezeshki
Amartya Mitra
Yoshua Bengio
Guillaume Lajoie
61
25
0
06 Dec 2021
On the Effectiveness of Neural Ensembles for Image Classification with Small Datasets
Lorenzo Brigato
Luca Iocchi
UQCV
30
0
0
29 Nov 2021
Model, sample, and epoch-wise descents: exact solution of gradient flow in the random feature model
A. Bodin
N. Macris
37
13
0
22 Oct 2021
Learning through atypical "phase transitions" in overparameterized neural networks
Carlo Baldassi
Clarissa Lauditi
Enrico M. Malatesta
R. Pacelli
Gabriele Perugini
R. Zecchina
26
26
0
01 Oct 2021
Scaling Laws for Neural Machine Translation
Behrooz Ghorbani
Orhan Firat
Markus Freitag
Ankur Bapna
M. Krikun
Xavier Garcia
Ciprian Chelba
Colin Cherry
40
99
0
16 Sep 2021
A Farewell to the Bias-Variance Tradeoff? An Overview of the Theory of Overparameterized Machine Learning
Yehuda Dar
Vidya Muthukumar
Richard G. Baraniuk
31
71
0
06 Sep 2021
Repulsive Deep Ensembles are Bayesian
Francesco D'Angelo
Vincent Fortuin
UQCV
BDL
59
94
0
22 Jun 2021
The Limitations of Large Width in Neural Networks: A Deep Gaussian Process Perspective
Geoff Pleiss
John P. Cunningham
28
24
0
11 Jun 2021
Explaining Neural Scaling Laws
Yasaman Bahri
Ethan Dyer
Jared Kaplan
Jaehoon Lee
Utkarsh Sharma
27
250
0
12 Feb 2021
A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks
Asaf Noy
Yi Tian Xu
Y. Aflalo
Lihi Zelnik-Manor
R. L. Jin
36
3
0
12 Jan 2021
Understanding Double Descent Requires a Fine-Grained Bias-Variance Decomposition
Ben Adlam
Jeffrey Pennington
UD
39
93
0
04 Nov 2020
Understanding Capacity-Driven Scale-Out Neural Recommendation Inference
Michael Lui
Yavuz Yetim
Özgür Özkan
Zhuoran Zhao
Shin-Yeh Tsai
Carole-Jean Wu
Mark Hempstead
GNN
BDL
LRM
22
51
0
04 Nov 2020
Memorizing without overfitting: Bias, variance, and interpolation in over-parameterized models
J. Rocks
Pankaj Mehta
18
41
0
26 Oct 2020
Review: Deep Learning in Electron Microscopy
Jeffrey M. Ede
34
79
0
17 Sep 2020
Multiple Descent: Design Your Own Generalization Curve
Lin Chen
Yifei Min
M. Belkin
Amin Karbasi
DRL
28
61
0
03 Aug 2020
Geometric compression of invariant manifolds in neural nets
J. Paccolat
Leonardo Petrini
Mario Geiger
Kevin Tyloo
M. Wyart
MLT
55
34
0
22 Jul 2020
GShard: Scaling Giant Models with Conditional Computation and Automatic Sharding
Dmitry Lepikhin
HyoukJoong Lee
Yuanzhong Xu
Dehao Chen
Orhan Firat
Yanping Huang
M. Krikun
Noam M. Shazeer
Z. Chen
MoE
31
1,108
0
30 Jun 2020
An analytic theory of shallow networks dynamics for hinge loss classification
Franco Pellegrini
Giulio Biroli
35
19
0
19 Jun 2020
On the training dynamics of deep networks with $L_2$ regularization
Aitor Lewkowycz
Guy Gur-Ari
41
53
0
15 Jun 2020
Double Double Descent: On Generalization Errors in Transfer Learning between Linear Regression Tasks
Yehuda Dar
Richard G. Baraniuk
36
19
0
12 Jun 2020
Double Descent Risk and Volume Saturation Effects: A Geometric Perspective
Prasad Cheema
M. Sugiyama
14
3
0
08 Jun 2020
Predicting the outputs of finite deep neural networks trained with noisy gradients
Gadi Naveh
Oded Ben-David
H. Sompolinsky
Zohar Ringel
19
20
0
02 Apr 2020
Double Trouble in Double Descent: Bias and Variance(s) in the Lazy Regime
Stéphane d'Ascoli
Maria Refinetti
Giulio Biroli
Florent Krzakala
93
152
0
02 Mar 2020
Generalisation error in learning with random features and the hidden manifold model
Federica Gerace
Bruno Loureiro
Florent Krzakala
M. Mézard
Lenka Zdeborová
25
165
0
21 Feb 2020
Implicit Regularization of Random Feature Models
Arthur Jacot
Berfin Simsek
Francesco Spadaro
Clément Hongler
Franck Gabriel
31
82
0
19 Feb 2020
Exact expressions for double descent and implicit regularization via surrogate random design
Michal Derezinski
Feynman T. Liang
Michael W. Mahoney
27
77
0
10 Dec 2019
A Model of Double Descent for High-dimensional Binary Linear Classification
Zeyu Deng
A. Kammoun
Christos Thrampoulidis
36
145
0
13 Nov 2019
Asymptotics of Wide Networks from Feynman Diagrams
Ethan Dyer
Guy Gur-Ari
29
113
0
25 Sep 2019
The generalization error of random features regression: Precise asymptotics and double descent curve
Song Mei
Andrea Montanari
57
626
0
14 Aug 2019
A type of generalization error induced by initialization in deep neural networks
Yaoyu Zhang
Zhi-Qin John Xu
Tao Luo
Zheng Ma
9
49
0
19 May 2019
The Loss Surfaces of Multilayer Networks
A. Choromańska
Mikael Henaff
Michaël Mathieu
Gerard Ben Arous
Yann LeCun
ODL
183
1,185
0
30 Nov 2014