arXiv: 2305.18270 (v4)
How Two-Layer Neural Networks Learn, One (Giant) Step at a Time
29 May 2023
Yatin Dandi, Florent Krzakala, Bruno Loureiro, Luca Pesce, Ludovic Stephan
Papers citing "How Two-Layer Neural Networks Learn, One (Giant) Step at a Time" (50 of 51 papers shown)
- Feature learning from non-Gaussian inputs: the case of Independent Component Analysis in high dimensions. Fabiola Ricci, Lorenzo Bardone, Sebastian Goldt (31 Mar 2025)
- Low-dimensional Functions are Efficiently Learnable under Randomly Biased Distributions. Elisabetta Cornacchia, Dan Mikulincer, Elchanan Mossel (10 Feb 2025)
- Deep Linear Network Training Dynamics from Random Initialization: Data, Width, Depth, and Hyperparameter Transfer. Blake Bordelon, Cengiz Pehlevan (04 Feb 2025)
- Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input. Ziang Chen, Rong Ge (10 Jan 2025)
- On the phase diagram of extensive-rank symmetric matrix denoising beyond rotational invariance. Jean Barbier, Francesco Camilli, Justin Ko, Koki Okajima (04 Nov 2024)
- How Feature Learning Can Improve Neural Scaling Laws. Blake Bordelon, Alexander B. Atanasov, Cengiz Pehlevan (26 Sep 2024)
- A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks. Behrad Moniri, Donghwan Lee, Hamed Hassani, Yan Sun (11 Oct 2023)
- Smoothing the Landscape Boosts the Signal for SGD: Optimal Sample Complexity for Learning Single Index Models. Alexandru Damian, Eshaan Nichani, Rong Ge, Jason D. Lee (18 May 2023)
- Dynamics of Finite Width Kernel and Prediction Fluctuations in Mean Field Neural Networks. Blake Bordelon, Cengiz Pehlevan (06 Apr 2023)
- Learning time-scales in two-layers neural networks. Raphael Berthier, Andrea Montanari, Kangjie Zhou (28 Feb 2023)
- SGD learning on neural networks: leap complexity and saddle-to-saddle dynamics. Emmanuel Abbe, Enric Boix-Adserà, Theodor Misiakiewicz (21 Feb 2023)
- Universality laws for Gaussian mixtures in generalized linear models. Yatin Dandi, Ludovic Stephan, Florent Krzakala, Bruno Loureiro, Lenka Zdeborová (17 Feb 2023)
- Precise Asymptotic Analysis of Deep Random Feature Models. David Bosch, Ashkan Panahi, B. Hassibi (13 Feb 2023)
- From high-dimensional & mean-field dynamics to dimensionless ODEs: A unifying approach to SGD in two-layers networks. Luca Arnaboldi, Ludovic Stephan, Florent Krzakala, Bruno Loureiro (12 Feb 2023)
- Deterministic equivalent and error universality of deep random features learning. Dominik Schröder, Hugo Cui, Daniil Dmitriev, Bruno Loureiro (01 Feb 2023)
- Learning Single-Index Models with Shallow Neural Networks. A. Bietti, Joan Bruna, Clayton Sanford, M. Song (27 Oct 2022)
- Neural Networks can Learn Representations with Gradient Descent. Alexandru Damian, Jason D. Lee, Mahdi Soltanolkotabi (30 Jun 2022)
- Learning sparse features can lead to overfitting in neural networks. Leonardo Petrini, Francesco Cagnetta, Eric Vanden-Eijnden, Matthieu Wyart (24 Jun 2022)
- High-dimensional limit theorems for SGD: Effective dynamics and critical scaling. Gerard Ben Arous, Reza Gheissari, Aukosh Jagannath (08 Jun 2022)
- Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs. Etienne Boursier, Loucas Pillaud-Vivien, Nicolas Flammarion (02 Jun 2022)
- High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation. Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang (03 May 2022)
- Universality of empirical risk minimization. Andrea Montanari, Basil Saeed (17 Feb 2022)
- Fluctuations, Bias, Variance & Ensemble of Learners: Exact Asymptotics for Convex Losses in High-Dimension. Bruno Loureiro, Cédric Gerbelot, Maria Refinetti, G. Sicuro, Florent Krzakala (31 Jan 2022)
- Separation of Scales and a Thermodynamic Description of Feature Learning in Some CNNs. Inbar Seroussi, Gadi Naveh, Zohar Ringel (31 Dec 2021)
- Neural Networks as Kernel Learners: The Silent Alignment Effect. Alexander B. Atanasov, Blake Bordelon, Cengiz Pehlevan (29 Oct 2021)
- The Eigenlearning Framework: A Conservation Law Perspective on Kernel Regression and Wide Neural Networks. James B. Simon, Madeline Dickens, Dhruva Karkada, M. DeWeese (08 Oct 2021)
- The staircase property: How hierarchical structure can guide deep learning. Emmanuel Abbe, Enric Boix-Adserà, Matthew Brennan, Guy Bresler, Dheeraj M. Nagaraj (24 Aug 2021)
- Saddle-to-Saddle Dynamics in Deep Linear Networks: Small Initialization Training, Symmetry, and Sparsity. Arthur Jacot, François Ged, Berfin Şimşek, Clément Hongler, Franck Gabriel (30 Jun 2021)
- A self consistent theory of Gaussian Processes captures feature learning effects in finite CNNs. Gadi Naveh, Zohar Ringel (08 Jun 2021)
- Generalization Error Rates in Kernel Regression: The Crossover from the Noiseless to Noisy Regime. Hugo Cui, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová (31 May 2021)
- How rotational invariance of common kernels prevents generalization in high dimensions. Konstantin Donhauser, Mingqi Wu, Fanny Yang (09 Apr 2021)
- Learning curves of generic features maps for realistic datasets with a teacher-student model. Bruno Loureiro, Cédric Gerbelot, Hugo Cui, Sebastian Goldt, Florent Krzakala, M. Mézard, Lenka Zdeborová (16 Feb 2021)
- Generalization error of random features and kernel methods: hypercontractivity and kernel matrix concentration. Song Mei, Theodor Misiakiewicz, Andrea Montanari (26 Jan 2021)
- The Gaussian equivalence of generative models for learning with shallow neural networks. Sebastian Goldt, Bruno Loureiro, Galen Reeves, Florent Krzakala, M. Mézard, Lenka Zdeborová (25 Jun 2020)
- When Do Neural Networks Outperform Kernel Methods? Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari (24 Jun 2020)
- Spectral Bias and Task-Model Alignment Explain Generalization in Kernel Regression and Infinitely Wide Neural Networks. Abdulkadir Canatar, Blake Bordelon, Cengiz Pehlevan (23 Jun 2020)
- Online stochastic gradient descent on non-convex losses from high-dimensional inference. Gerard Ben Arous, Reza Gheissari, Aukosh Jagannath (23 Mar 2020)
- The large learning rate phase of deep learning: the catapult mechanism. Aitor Lewkowycz, Yasaman Bahri, Ethan Dyer, Jascha Narain Sohl-Dickstein, Guy Gur-Ari (04 Mar 2020)
- Generalisation error in learning with random features and the hidden manifold model. Federica Gerace, Bruno Loureiro, Florent Krzakala, M. Mézard, Lenka Zdeborová (21 Feb 2020)
- Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks. Blake Bordelon, Abdulkadir Canatar, Cengiz Pehlevan (07 Feb 2020)
- The generalization error of random features regression: Precise asymptotics and double descent curve. Song Mei, Andrea Montanari (14 Aug 2019)
- Limitations of Lazy Training of Two-layers Neural Networks. Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari (21 Jun 2019)
- SGD on Neural Networks Learns Functions of Increasing Complexity. Preetum Nakkiran, Gal Kaplun, Dimitris Kalimeris, Tristan Yang, Benjamin L. Edelman, Fred Zhang, Boaz Barak (28 May 2019)
- On Lazy Training in Differentiable Programming. Lénaïc Chizat, Edouard Oyallon, Francis R. Bach (19 Dec 2018)
- Mean Field Analysis of Neural Networks: A Central Limit Theorem. Justin A. Sirignano, K. Spiliopoulos (28 Aug 2018)
- On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport. Lénaïc Chizat, Francis R. Bach (24 May 2018)
- Trainability and Accuracy of Neural Networks: An Interacting Particle System Approach. Grant M. Rotskoff, Eric Vanden-Eijnden (02 May 2018)
- A Mean Field View of the Landscape of Two-Layers Neural Networks. Song Mei, Andrea Montanari, Phan-Minh Nguyen (18 Apr 2018)
- Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour. Priya Goyal, Piotr Dollár, Ross B. Girshick, P. Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, Kaiming He (08 Jun 2017)
- Generalization Properties of Learning with Random Features. Alessandro Rudi, Lorenzo Rosasco (14 Feb 2016)