On Lazy Training in Differentiable Programming

19 December 2018

Papers citing "On Lazy Training in Differentiable Programming"

50 / 246 papers shown

Title
Quantitative CLTs in Deep Neural Networks Stefano Favaro Boris Hanin Domenico Marinucci I. Nourdin G. Peccati BDL 41 12 0 12 Jul 2023
The RL Perceptron: Generalisation Dynamics of Policy Learning in High Dimensions Nishil Patel Sebastian Lee Stefano Sarao Mannelli Sebastian Goldt Andrew Saxe OffRL 36 3 0 17 Jun 2023
Generalization Guarantees of Gradient Descent for Multi-Layer Neural Networks Puyu Wang Yunwen Lei Di Wang Yiming Ying Ding-Xuan Zhou MLT 29 4 0 26 May 2023
Tight conditions for when the NTK approximation is valid Enric Boix-Adserà Etai Littwin 35 0 0 22 May 2023
Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained Models Guillermo Ortiz-Jiménez Alessandro Favero P. Frossard MoMe 51 113 0 22 May 2023
How Spurious Features Are Memorized: Precise Analysis for Random and NTK Features Simone Bombari Marco Mondelli AAML 42 5 0 20 May 2023
Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks Eshaan Nichani Alexandru Damian Jason D. Lee MLT 47 13 0 11 May 2023
Infinitely wide limits for deep Stable neural networks: sub-linear, linear and super-linear activation functions Alberto Bordino Stefano Favaro S. Fortini 32 7 0 08 Apr 2023
Dynamics of Finite Width Kernel and Prediction Fluctuations in Mean Field Neural Networks Blake Bordelon Cengiz Pehlevan MLT 38 29 0 06 Apr 2023
Wide neural networks: From non-gaussian random fields at initialization to the NTK geometry of training Luís Carvalho Joao L. Costa José Mourao Gonçalo Oliveira AI4CE 26 1 0 06 Apr 2023
Saddle-to-Saddle Dynamics in Diagonal Linear Networks Scott Pesme Nicolas Flammarion 42 35 0 02 Apr 2023
Analyzing Convergence in Quantum Neural Networks: Deviations from Neural Tangent Kernels Xuchen You Shouvanik Chakrabarti Boyang Chen Xiaodi Wu 42 10 0 26 Mar 2023
Online Learning for the Random Feature Model in the Student-Teacher Framework Roman Worschech B. Rosenow 46 0 0 24 Mar 2023
Phase Diagram of Initial Condensation for Two-layer Neural Networks Zheng Chen Yuqing Li Tao Luo Zhaoguang Zhou Z. Xu MLT AI4CE 49 9 0 12 Mar 2023
Sparsity May Cry: Let Us Fail (Current) Sparse Neural Networks Together! Shiwei Liu Tianlong Chen Zhenyu Zhang Xuxi Chen Tianjin Huang Ajay Jaiswal Zhangyang Wang 37 29 0 03 Mar 2023
Differentially Private Neural Tangent Kernels for Privacy-Preserving Data Generation Yilin Yang Kamil Adamczewski Danica J. Sutherland Xiaoxiao Li Mijung Park 33 14 0 03 Mar 2023
The Ladder in Chaos: A Simple and Effective Improvement to General DRL Algorithms by Policy Path Trimming and Boosting Hongyao Tang Mengdi Zhang Jianye Hao 28 1 0 02 Mar 2023
Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron Weihang Xu S. Du 39 16 0 20 Feb 2023
Dataset Distillation with Convexified Implicit Gradients Noel Loo Ramin Hasani Mathias Lechner Daniela Rus DD 31 42 0 13 Feb 2023
How to prepare your task head for finetuning Yi Ren Shangmin Guo Wonho Bae Danica J. Sutherland 24 14 0 11 Feb 2023
Beyond the Universal Law of Robustness: Sharper Laws for Random Features and Neural Tangent Kernels Simone Bombari Shayan Kiyani Marco Mondelli AAML 46 10 0 03 Feb 2023
Understanding Reconstruction Attacks with the Neural Tangent Kernel and Dataset Distillation Noel Loo Ramin Hasani Mathias Lechner Alexander Amini Daniela Rus DD 42 5 0 02 Feb 2023
Over-parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning François Caron Fadhel Ayed Paul Jung Hoileong Lee Juho Lee Hongseok Yang 67 2 0 02 Feb 2023
Dissecting the Effects of SGD Noise in Distinct Regimes of Deep Learning Antonio Sclocchi Mario Geiger M. Wyart 40 6 0 31 Jan 2023
A Simple Algorithm For Scaling Up Kernel Methods Tengyu Xu Bryan Kelly Semyon Malamud 23 0 0 26 Jan 2023
ZiCo: Zero-shot NAS via Inverse Coefficient of Variation on Gradients Guihong Li Yuedong Yang Kartikeya Bhardwaj R. Marculescu 36 61 0 26 Jan 2023
An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models Yufeng Zhang Boyi Liu Qi Cai Lingxiao Wang Zhaoran Wang 53 11 0 30 Dec 2022
The Quantum Path Kernel: a Generalized Quantum Neural Tangent Kernel for Deep Quantum Machine Learning Massimiliano Incudini Michele Grossi Antonio Mandarino S. Vallecorsa Alessandra Di Pierro David Windridge 38 6 0 22 Dec 2022
Learning threshold neurons via the "edge of stability" Kwangjun Ahn Sébastien Bubeck Sinho Chewi Y. Lee Felipe Suarez Yi Zhang MLT 38 36 0 14 Dec 2022
Selective Amnesia: On Efficient, High-Fidelity and Blind Suppression of Backdoor Effects in Trojaned Machine Learning Models Rui Zhu Di Tang Siyuan Tang Xiaofeng Wang Haixu Tang AAML FedML 37 13 0 09 Dec 2022
Statistical Physics of Deep Neural Networks: Initialization toward Optimal Channels Kangyu Weng Aohua Cheng Ziyang Zhang Pei Sun Yang Tian 53 2 0 04 Dec 2022
Infinite-width limit of deep linear neural networks Lénaïc Chizat Maria Colombo Xavier Fernández-Real Alessio Figalli 31 14 0 29 Nov 2022
A Kernel Perspective of Skip Connections in Convolutional Networks Daniel Barzilai Amnon Geifman Meirav Galun Ronen Basri 23 12 0 27 Nov 2022
Why Neural Networks Work Sayan Mukherjee Bernardo A. Huberman 19 2 0 26 Nov 2022
Linear Interpolation In Parameter Space is Good Enough for Fine-Tuned Language Models Mark Rofin Nikita Balagansky Daniil Gavrilov MoMe KELM 38 5 0 22 Nov 2022
Do highly over-parameterized neural networks generalize since bad solutions are rare? Julius Martinetz T. Martinetz 30 1 0 07 Nov 2022
A Functional-Space Mean-Field Theory of Partially-Trained Three-Layer Neural Networks Zhengdao Chen Eric Vanden-Eijnden Joan Bruna MLT 27 5 0 28 Oct 2022
Learning Single-Index Models with Shallow Neural Networks A. Bietti Joan Bruna Clayton Sanford M. Song 170 68 0 27 Oct 2022
Evolution of Neural Tangent Kernels under Benign and Adversarial Training Noel Loo Ramin Hasani Alexander Amini Daniela Rus AAML 36 13 0 21 Oct 2022
When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work Jiawei Zhang Yushun Zhang Mingyi Hong Ruoyu Sun Zhi-Quan Luo 31 10 0 21 Oct 2022
Global Convergence of SGD On Two Layer Neural Nets Pulkit Gopalani Anirbit Mukherjee 26 5 0 20 Oct 2022
What Can the Neural Tangent Kernel Tell Us About Adversarial Robustness? Nikolaos Tsilivis Julia Kempe AAML 47 18 0 11 Oct 2022
SGD with Large Step Sizes Learns Sparse Features Maksym Andriushchenko Aditya Varre Loucas Pillaud-Vivien Nicolas Flammarion 45 56 0 11 Oct 2022
Meta-Principled Family of Hyperparameter Scaling Strategies Sho Yaida 58 16 0 10 Oct 2022
Continual task learning in natural and artificial agents Timo Flesch Andrew M. Saxe Christopher Summerfield CLL 43 24 0 10 Oct 2022
On skip connections and normalisation layers in deep optimisation L. MacDonald Jack Valmadre Hemanth Saratchandran Simon Lucey ODL 34 1 0 10 Oct 2022
Neural Networks Efficiently Learn Low-Dimensional Representations with SGD Alireza Mousavi-Hosseini Sejun Park M. Girotti Ioannis Mitliagkas Murat A. Erdogdu MLT 324 48 0 29 Sep 2022
Magnitude and Angle Dynamics in Training Single ReLU Neurons Sangmin Lee Byeongsu Sim Jong Chul Ye MLT 96 6 0 27 Sep 2022
Lazy vs hasty: linearization in deep networks impacts learning schedule based on example difficulty Thomas George Guillaume Lajoie A. Baratin 34 5 0 19 Sep 2022
Approximation results for Gradient Descent trained Shallow Neural Networks in $1d$ R. Gentile G. Welper ODL 56 6 0 17 Sep 2022