When and how epochwise double descent happens

Cory Stephenson, Tyler Lee
26 August 2021 · arXiv:2108.12006

Papers citing "When and how epochwise double descent happens"

23 of 23 papers shown:
1. On the Relationship Between Double Descent of CNNs and Shape/Texture Bias Under Learning Process
   Shun Iwase, Shuya Takahashi, Nakamasa Inoue, Rio Yokota, Ryo Nakamura, Hirokatsu Kataoka
   04 Mar 2025 · cited by 0
2. Double Descent Meets Out-of-Distribution Detection: Theoretical Insights and Empirical Analysis on the role of model complexity
   Mouin Ben Ammar, David Brellmann, Arturo Mendoza, Antoine Manzanera, Gianni Franchi
   04 Nov 2024 · OODD · cited by 0
3. On the geometry of generalization and memorization in deep neural networks
   Cory Stephenson, Suchismita Padhy, Abhinav Ganesh, Yue Hui, Hanlin Tang, SueYeon Chung
   30 May 2021 · TDI, AI4CE · cited by 73
4. Pervasive Label Errors in Test Sets Destabilize Machine Learning Benchmarks
   Curtis G. Northcutt, Anish Athalye, Jonas W. Mueller
   26 Mar 2021 · cited by 532
5. Early Stopping in Deep Networks: Double Descent and How to Eliminate it
   Reinhard Heckel, Fatih Yilmaz
   20 Jul 2020 · cited by 45
6. A Brief Prehistory of Double Descent
   Marco Loog, Tom Viering, Alexander Mey, Jesse H. Krijthe, David Tax
   07 Apr 2020 · cited by 69
7. More Data Can Hurt for Linear Regression: Sample-wise Double Descent
   Preetum Nakkiran
   16 Dec 2019 · cited by 68
8. Deep Double Descent: Where Bigger Models and More Data Hurt
   Preetum Nakkiran, Gal Kaplun, Yamini Bansal, Tristan Yang, Boaz Barak, Ilya Sutskever
   04 Dec 2019 · cited by 942
9. PyTorch: An Imperative Style, High-Performance Deep Learning Library
   Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, ..., Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, Soumith Chintala
   03 Dec 2019 · ODL · cited by 42,449
10. The generalization error of random features regression: Precise asymptotics and double descent curve
    Song Mei, Andrea Montanari
    14 Aug 2019 · cited by 635
11. When Does Label Smoothing Help?
    Rafael Müller, Simon Kornblith, Geoffrey E. Hinton
    06 Jun 2019 · UQCV · cited by 1,950
12. SGD on Neural Networks Learns Functions of Increasing Complexity
    Preetum Nakkiran, Gal Kaplun, Dimitris Kalimeris, Tristan Yang, Benjamin L. Edelman, Fred Zhang, Boaz Barak
    28 May 2019 · MLT · cited by 247
13. Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent
    Jaehoon Lee, Lechao Xiao, Samuel S. Schoenholz, Yasaman Bahri, Roman Novak, Jascha Sohl-Dickstein, Jeffrey Pennington
    18 Feb 2019 · cited by 1,104
14. Reconciling modern machine learning practice and the bias-variance trade-off
    Mikhail Belkin, Daniel J. Hsu, Siyuan Ma, Soumik Mandal
    28 Dec 2018 · cited by 1,650
15. Gradient Descent Provably Optimizes Over-parameterized Neural Networks
    Simon S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh
    04 Oct 2018 · MLT, ODL · cited by 1,272
16. Insights on representational similarity in neural networks with canonical correlation
    Ari S. Morcos, Maithra Raghu, Samy Bengio
    14 Jun 2018 · DRL · cited by 446
17. To understand deep learning we need to understand kernel learning
    Mikhail Belkin, Siyuan Ma, Soumik Mandal
    05 Feb 2018 · cited by 419
18. High-dimensional dynamics of generalization error in neural networks
    Madhu S. Advani, Andrew M. Saxe
    10 Oct 2017 · AI4CE · cited by 469
19. Understanding deep learning requires rethinking generalization
    Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals
    10 Nov 2016 · HAI · cited by 4,629
20. An overview of gradient descent optimization algorithms
    Sebastian Ruder
    15 Sep 2016 · ODL · cited by 6,188
21. Deep Residual Learning for Image Recognition
    Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
    10 Dec 2015 · MedIm · cited by 194,020
22. Rethinking the Inception Architecture for Computer Vision
    Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens, Zbigniew Wojna
    02 Dec 2015 · 3DV, BDL · cited by 27,373
23. Practical recommendations for gradient-based training of deep architectures
    Yoshua Bengio
    24 Jun 2012 · 3DH, ODL · cited by 2,200