Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent

18 February 2019

Jascha Narain Sohl-Dickstein

Jeffrey Pennington

Papers citing "Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent"

50 / 261 papers shown

On the Weight Dynamics of Deep Normalized Networks Christian H. X. Ali Mehmeti-Göpel Michael Wand 38 1 0 01 Jun 2023
Mind the spikes: Benign overfitting of kernels and neural networks in fixed dimension Moritz Haas David Holzmüller U. V. Luxburg Ingo Steinwart MLT 35 14 0 23 May 2023
Tight conditions for when the NTK approximation is valid Enric Boix-Adserà Etai Littwin 30 0 0 22 May 2023
Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained Models Guillermo Ortiz-Jiménez Alessandro Favero P. Frossard MoMe 51 112 0 22 May 2023
On the Eigenvalue Decay Rates of a Class of Neural-Network Related Kernel Functions Defined on General Domains Yicheng Li Zixiong Yu Y. Cotronis Qian Lin 55 13 0 04 May 2023
Automatic Gradient Descent: Deep Learning without Hyperparameters Jeremy Bernstein Chris Mingard Kevin Huang Navid Azizan Yisong Yue ODL 16 17 0 11 Apr 2023
Dynamics of Finite Width Kernel and Prediction Fluctuations in Mean Field Neural Networks Blake Bordelon Cengiz Pehlevan MLT 38 29 0 06 Apr 2023
Wide neural networks: From non-gaussian random fields at initialization to the NTK geometry of training Luís Carvalho Joao L. Costa José Mourao Gonçalo Oliveira AI4CE 26 1 0 06 Apr 2023
NTK-SAP: Improving neural network pruning by aligning training dynamics Yite Wang Dawei Li Ruoyu Sun 39 19 0 06 Apr 2023
Competitive plasticity to reduce the energetic costs of learning Mark C. W. van Rossum 15 2 0 04 Apr 2023
On the Stepwise Nature of Self-Supervised Learning James B. Simon Maksis Knutins Liu Ziyin Daniel Geisz Abraham J. Fetterman Joshua Albrecht SSL 37 30 0 27 Mar 2023
Online Learning for the Random Feature Model in the Student-Teacher Framework Roman Worschech B. Rosenow 46 0 0 24 Mar 2023
Controlled Descent Training Viktor Andersson B. Varga Vincent Szolnoky Andreas Syrén Rebecka Jörnsten Balázs Kulcsár 43 1 0 16 Mar 2023
SAM operates far from home: eigenvalue regularization as a dynamical phenomenon Atish Agarwala Yann N. Dauphin 21 20 0 17 Feb 2023
Dataset Distillation with Convexified Implicit Gradients Noel Loo Ramin Hasani Mathias Lechner Daniela Rus DD 31 41 0 13 Feb 2023
How to prepare your task head for finetuning Yi Ren Shangmin Guo Wonho Bae Danica J. Sutherland 24 14 0 11 Feb 2023
Efficient Parametric Approximations of Neural Network Function Space Distance Nikita Dhawan Sicong Huang Juhan Bae Roger C. Grosse 16 5 0 07 Feb 2023
The SSL Interplay: Augmentations, Inductive Bias, and Generalization Vivien A. Cabannes B. Kiani Randall Balestriero Yann LeCun A. Bietti SSL 19 31 0 06 Feb 2023
On a continuous time model of gradient descent dynamics and instability in deep learning Mihaela Rosca Yan Wu Chongli Qin Benoit Dherin 18 7 0 03 Feb 2023
Understanding Reconstruction Attacks with the Neural Tangent Kernel and Dataset Distillation Noel Loo Ramin Hasani Mathias Lechner Alexander Amini Daniela Rus DD 42 5 0 02 Feb 2023
Over-parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning François Caron Fadhel Ayed Paul Jung Hoileong Lee Juho Lee Hongseok Yang 64 2 0 02 Feb 2023
Bayes-optimal Learning of Deep Random Networks of Extensive-width Hugo Cui Florent Krzakala Lenka Zdeborová BDL 25 35 0 01 Feb 2023
Supervision Complexity and its Role in Knowledge Distillation Hrayr Harutyunyan A. S. Rawat A. Menon Seungyeon Kim Surinder Kumar 30 12 0 28 Jan 2023
A Simple Algorithm For Scaling Up Kernel Methods Tengyu Xu Bryan Kelly Semyon Malamud 16 0 0 26 Jan 2023
ZiCo: Zero-shot NAS via Inverse Coefficient of Variation on Gradients Guihong Li Yuedong Yang Kartikeya Bhardwaj R. Marculescu 36 61 0 26 Jan 2023
Catapult Dynamics and Phase Transitions in Quadratic Nets David Meltzer Junyu Liu 27 9 0 18 Jan 2023
Dataset Distillation: A Comprehensive Review Ruonan Yu Songhua Liu Xinchao Wang DD 53 121 0 17 Jan 2023
Effects of Data Geometry in Early Deep Learning Saket Tiwari George Konidaris 79 7 0 29 Dec 2022
The Underlying Correlated Dynamics in Neural Training Rotem Turjeman Tom Berkov I. Cohen Guy Gilboa 27 3 0 18 Dec 2022
Graph Neural Networks are Inherently Good Generalizers: Insights by Bridging GNNs and MLPs Chenxiao Yang Qitian Wu Jiahua Wang Junchi Yan AI4CE 19 51 0 18 Dec 2022
Leveraging Unlabeled Data to Track Memorization Mahsa Forouzesh Hanie Sedghi Patrick Thiran NoLa TDI 34 4 0 08 Dec 2022
Statistical Physics of Deep Neural Networks: Initialization toward Optimal Channels Kangyu Weng Aohua Cheng Ziyang Zhang Pei Sun Yang Tian 53 2 0 04 Dec 2022
Neural tangent kernel analysis of PINN for advection-diffusion equation M. Saadat B. Gjorgiev L. Das G. Sansavini 33 0 0 21 Nov 2022
Characterizing the Spectrum of the NTK via a Power Series Expansion Michael Murray Hui Jin Benjamin Bowman Guido Montúfar 38 11 0 15 Nov 2022
Do highly over-parameterized neural networks generalize since bad solutions are rare? Julius Martinetz T. Martinetz 30 1 0 07 Nov 2022
Understanding and Mitigating Overfitting in Prompt Tuning for Vision-Language Models Cheng Ma Yang Liu Jiankang Deng Lingxi Xie Weiming Dong Changsheng Xu VLM VPVLM 34 44 0 04 Nov 2022
A Solvable Model of Neural Scaling Laws A. Maloney Daniel A. Roberts J. Sully 36 51 0 30 Oct 2022
Proximal Mean Field Learning in Shallow Neural Networks Alexis M. H. Teter Iman Nodozi A. Halder FedML 43 1 0 25 Oct 2022
Evolution of Neural Tangent Kernels under Benign and Adversarial Training Noel Loo Ramin Hasani Alexander Amini Daniela Rus AAML 34 13 0 21 Oct 2022
When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work Jiawei Zhang Yushun Zhang Mingyi Hong Ruoyu Sun Zhi-Quan Luo 26 10 0 21 Oct 2022
Bayesian deep learning framework for uncertainty quantification in high dimensions Jeahan Jung Minseok Choi BDL UQCV 21 1 0 21 Oct 2022
Global Convergence of SGD On Two Layer Neural Nets Pulkit Gopalani Anirbit Mukherjee 26 5 0 20 Oct 2022
Understanding Impacts of Task Similarity on Backdoor Attack and Detection Di Tang Rui Zhu Xiaofeng Wang Haixu Tang Yi Chen AAML 24 5 0 12 Oct 2022
What Can the Neural Tangent Kernel Tell Us About Adversarial Robustness? Nikolaos Tsilivis Julia Kempe AAML 44 17 0 11 Oct 2022
Meta-Principled Family of Hyperparameter Scaling Strategies Sho Yaida 58 16 0 10 Oct 2022
Second-order regression models exhibit progressive sharpening to the edge of stability Atish Agarwala Fabian Pedregosa Jeffrey Pennington 35 26 0 10 Oct 2022
Continual task learning in natural and artificial agents Timo Flesch Andrew M. Saxe Christopher Summerfield CLL 43 24 0 10 Oct 2022
Critical Learning Periods for Multisensory Integration in Deep Networks Michael Kleinman Alessandro Achille Stefano Soatto 35 10 0 06 Oct 2022
FedMT: Federated Learning with Mixed-type Labels Qiong Zhang Jing Peng Xin Zhang A. Talhouk Gang Niu Xiaoxiao Li FedML 56 0 0 05 Oct 2022
Scale-invariant Bayesian Neural Networks with Connectivity Tangent Kernel Sungyub Kim Si-hun Park Kyungsu Kim Eunho Yang BDL 29 4 0 30 Sep 2022