How Two-Layer Neural Networks Learn, One (Giant) Step at a Time
arXiv: 2305.18270 · 29 May 2023
Yatin Dandi, Florent Krzakala, Bruno Loureiro, Luca Pesce, Ludovic Stephan
Topics: MLT
Papers citing "How Two-Layer Neural Networks Learn, One (Giant) Step at a Time"
Showing 50 of 51 citing papers. Each entry gives the title, authors, topic tags where assigned, the three numeric counters displayed on the site, and the paper's date.
 1. Feature learning from non-Gaussian inputs: the case of Independent Component Analysis in high dimensions
    Fabiola Ricci, Lorenzo Bardone, Sebastian Goldt · OOD · 196 / 0 / 0 · 31 Mar 2025
 2. Low-dimensional Functions are Efficiently Learnable under Randomly Biased Distributions
    Elisabetta Cornacchia, Dan Mikulincer, Elchanan Mossel · 115 / 1 / 0 · 10 Feb 2025
 3. Deep Linear Network Training Dynamics from Random Initialization: Data, Width, Depth, and Hyperparameter Transfer
    Blake Bordelon, Cengiz Pehlevan · AI4CE · 162 / 1 / 0 · 04 Feb 2025
 4. Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input
    Ziang Chen, Rong Ge · MLT · 115 / 1 / 0 · 10 Jan 2025
 5. On the phase diagram of extensive-rank symmetric matrix denoising beyond rotational invariance
    Jean Barbier, Francesco Camilli, Justin Ko, Koki Okajima · 82 / 6 / 0 · 04 Nov 2024
 6. How Feature Learning Can Improve Neural Scaling Laws
    Blake Bordelon, Alexander B. Atanasov, Cengiz Pehlevan · 93 / 16 / 0 · 26 Sep 2024
 7. A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks
    Behrad Moniri, Donghwan Lee, Hamed Hassani, Yan Sun · MLT · 84 / 22 / 0 · 11 Oct 2023
 8. Smoothing the Landscape Boosts the Signal for SGD: Optimal Sample Complexity for Learning Single Index Models
    Alexandru Damian, Eshaan Nichani, Rong Ge, Jason D. Lee · MLT · 77 / 39 / 0 · 18 May 2023
 9. Dynamics of Finite Width Kernel and Prediction Fluctuations in Mean Field Neural Networks
    Blake Bordelon, Cengiz Pehlevan · MLT · 70 / 31 / 0 · 06 Apr 2023
10. Learning time-scales in two-layers neural networks
    Raphael Berthier, Andrea Montanari, Kangjie Zhou · 135 / 36 / 0 · 28 Feb 2023
11. SGD learning on neural networks: leap complexity and saddle-to-saddle dynamics
    Emmanuel Abbe, Enric Boix-Adserà, Theodor Misiakiewicz · FedML, MLT · 138 / 86 / 0 · 21 Feb 2023
12. Universality laws for Gaussian mixtures in generalized linear models
    Yatin Dandi, Ludovic Stephan, Florent Krzakala, Bruno Loureiro, Lenka Zdeborová · FedML · 72 / 21 / 0 · 17 Feb 2023
13. Precise Asymptotic Analysis of Deep Random Feature Models
    David Bosch, Ashkan Panahi, B. Hassibi · 66 / 19 / 0 · 13 Feb 2023
14. From high-dimensional & mean-field dynamics to dimensionless ODEs: A unifying approach to SGD in two-layers networks
    Luca Arnaboldi, Ludovic Stephan, Florent Krzakala, Bruno Loureiro · MLT · 79 / 33 / 0 · 12 Feb 2023
15. Deterministic equivalent and error universality of deep random features learning
    Dominik Schröder, Hugo Cui, Daniil Dmitriev, Bruno Loureiro · MLT · 73 / 28 / 0 · 01 Feb 2023
16. Learning Single-Index Models with Shallow Neural Networks
    A. Bietti, Joan Bruna, Clayton Sanford, M. Song · 193 / 71 / 0 · 27 Oct 2022
17. Neural Networks can Learn Representations with Gradient Descent
    Alexandru Damian, Jason D. Lee, Mahdi Soltanolkotabi · SSL, MLT · 90 / 123 / 0 · 30 Jun 2022
18. Learning sparse features can lead to overfitting in neural networks
    Leonardo Petrini, Francesco Cagnetta, Eric Vanden-Eijnden, Matthieu Wyart · MLT · 78 / 25 / 0 · 24 Jun 2022
19. High-dimensional limit theorems for SGD: Effective dynamics and critical scaling
    Gerard Ben Arous, Reza Gheissari, Aukosh Jagannath · 102 / 59 / 0 · 08 Jun 2022
20. Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs
    Etienne Boursier, Loucas Pillaud-Vivien, Nicolas Flammarion · ODL · 51 / 61 / 0 · 02 Jun 2022
21. High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation
    Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang · MLT · 87 / 128 / 0 · 03 May 2022
22. Universality of empirical risk minimization
    Andrea Montanari, Basil Saeed · OOD · 63 / 78 / 0 · 17 Feb 2022
23. Fluctuations, Bias, Variance & Ensemble of Learners: Exact Asymptotics for Convex Losses in High-Dimension
    Bruno Loureiro, Cédric Gerbelot, Maria Refinetti, G. Sicuro, Florent Krzakala · 72 / 27 / 0 · 31 Jan 2022
24. Separation of Scales and a Thermodynamic Description of Feature Learning in Some CNNs
    Inbar Seroussi, Gadi Naveh, Zohar Ringel · 74 / 55 / 0 · 31 Dec 2021
25. Neural Networks as Kernel Learners: The Silent Alignment Effect
    Alexander B. Atanasov, Blake Bordelon, Cengiz Pehlevan · MLT · 77 / 85 / 0 · 29 Oct 2021
26. The Eigenlearning Framework: A Conservation Law Perspective on Kernel Regression and Wide Neural Networks
    James B. Simon, Madeline Dickens, Dhruva Karkada, M. DeWeese · 74 / 28 / 0 · 08 Oct 2021
27. The staircase property: How hierarchical structure can guide deep learning
    Emmanuel Abbe, Enric Boix-Adserà, Matthew Brennan, Guy Bresler, Dheeraj M. Nagaraj · 51 / 55 / 0 · 24 Aug 2021
28. Saddle-to-Saddle Dynamics in Deep Linear Networks: Small Initialization Training, Symmetry, and Sparsity
    Arthur Jacot, François Ged, Berfin Şimşek, Clément Hongler, Franck Gabriel · 58 / 55 / 0 · 30 Jun 2021
29. A self consistent theory of Gaussian Processes captures feature learning effects in finite CNNs
    Gadi Naveh, Zohar Ringel · SSL, MLT · 72 / 32 / 0 · 08 Jun 2021
30. Generalization Error Rates in Kernel Regression: The Crossover from the Noiseless to Noisy Regime
    Hugo Cui, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová · 76 / 85 / 0 · 31 May 2021
31. How rotational invariance of common kernels prevents generalization in high dimensions
    Konstantin Donhauser, Mingqi Wu, Fanny Yang · 76 / 24 / 0 · 09 Apr 2021
32. Learning curves of generic features maps for realistic datasets with a teacher-student model
    Bruno Loureiro, Cédric Gerbelot, Hugo Cui, Sebastian Goldt, Florent Krzakala, M. Mézard, Lenka Zdeborová · 99 / 140 / 0 · 16 Feb 2021
33. Generalization error of random features and kernel methods: hypercontractivity and kernel matrix concentration
    Song Mei, Theodor Misiakiewicz, Andrea Montanari · 84 / 112 / 0 · 26 Jan 2021
34. The Gaussian equivalence of generative models for learning with shallow neural networks
    Sebastian Goldt, Bruno Loureiro, Galen Reeves, Florent Krzakala, M. Mézard, Lenka Zdeborová · BDL · 85 / 107 / 0 · 25 Jun 2020
35. When Do Neural Networks Outperform Kernel Methods?
    Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari · 88 / 189 / 0 · 24 Jun 2020
36. Spectral Bias and Task-Model Alignment Explain Generalization in Kernel Regression and Infinitely Wide Neural Networks
    Abdulkadir Canatar, Blake Bordelon, Cengiz Pehlevan · 97 / 189 / 0 · 23 Jun 2020
37. Online stochastic gradient descent on non-convex losses from high-dimensional inference
    Gerard Ben Arous, Reza Gheissari, Aukosh Jagannath · 65 / 89 / 0 · 23 Mar 2020
38. The large learning rate phase of deep learning: the catapult mechanism
    Aitor Lewkowycz, Yasaman Bahri, Ethan Dyer, Jascha Narain Sohl-Dickstein, Guy Gur-Ari · ODL · 194 / 241 / 0 · 04 Mar 2020
39. Generalisation error in learning with random features and the hidden manifold model
    Federica Gerace, Bruno Loureiro, Florent Krzakala, M. Mézard, Lenka Zdeborová · 67 / 172 / 0 · 21 Feb 2020
40. Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks
    Blake Bordelon, Abdulkadir Canatar, Cengiz Pehlevan · 225 / 208 / 0 · 07 Feb 2020
41. The generalization error of random features regression: Precise asymptotics and double descent curve
    Song Mei, Andrea Montanari · 95 / 639 / 0 · 14 Aug 2019
42. Limitations of Lazy Training of Two-layers Neural Networks
    Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari · MLT · 55 / 143 / 0 · 21 Jun 2019
43. SGD on Neural Networks Learns Functions of Increasing Complexity
    Preetum Nakkiran, Gal Kaplun, Dimitris Kalimeris, Tristan Yang, Benjamin L. Edelman, Fred Zhang, Boaz Barak · MLT · 133 / 247 / 0 · 28 May 2019
44. On Lazy Training in Differentiable Programming
    Lénaïc Chizat, Edouard Oyallon, Francis R. Bach · 111 / 839 / 0 · 19 Dec 2018
45. Mean Field Analysis of Neural Networks: A Central Limit Theorem
    Justin A. Sirignano, K. Spiliopoulos · MLT · 75 / 194 / 0 · 28 Aug 2018
46. On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport
    Lénaïc Chizat, Francis R. Bach · OT · 212 / 736 / 0 · 24 May 2018
47. Trainability and Accuracy of Neural Networks: An Interacting Particle System Approach
    Grant M. Rotskoff, Eric Vanden-Eijnden · 114 / 123 / 0 · 02 May 2018
48. A Mean Field View of the Landscape of Two-Layers Neural Networks
    Song Mei, Andrea Montanari, Phan-Minh Nguyen · MLT · 98 / 861 / 0 · 18 Apr 2018
49. Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour
    Priya Goyal, Piotr Dollár, Ross B. Girshick, P. Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, Kaiming He · 3DH · 128 / 3,685 / 0 · 08 Jun 2017
50. Generalization Properties of Learning with Random Features
    Alessandro Rudi, Lorenzo Rosasco · MLT · 68 / 331 / 0 · 14 Feb 2016