How Two-Layer Neural Networks Learn, One (Giant) Step at a Time
arXiv: 2305.18270 · 29 May 2023
Yatin Dandi, Florent Krzakala, Bruno Loureiro, Luca Pesce, Ludovic Stephan
Topics: MLT
Papers citing "How Two-Layer Neural Networks Learn, One (Giant) Step at a Time"
Showing 50 of 51 citing papers. Each entry gives the title, authors, topic tags where assigned, the three numeric counters displayed on the site, and the paper's date.
 1. Feature learning from non-Gaussian inputs: the case of Independent Component Analysis in high dimensions
    Fabiola Ricci, Lorenzo Bardone, Sebastian Goldt · OOD · 196 / 0 / 0 · 31 Mar 2025
 2. Low-dimensional Functions are Efficiently Learnable under Randomly Biased Distributions
    Elisabetta Cornacchia, Dan Mikulincer, Elchanan Mossel · 115 / 1 / 0 · 10 Feb 2025
 3. Deep Linear Network Training Dynamics from Random Initialization: Data, Width, Depth, and Hyperparameter Transfer
    Blake Bordelon, Cengiz Pehlevan · AI4CE · 162 / 1 / 0 · 04 Feb 2025
 4. Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input
    Ziang Chen, Rong Ge · MLT · 115 / 1 / 0 · 10 Jan 2025
 5. On the phase diagram of extensive-rank symmetric matrix denoising beyond rotational invariance
    Jean Barbier, Francesco Camilli, Justin Ko, Koki Okajima · 82 / 6 / 0 · 04 Nov 2024
 6. How Feature Learning Can Improve Neural Scaling Laws
    Blake Bordelon, Alexander B. Atanasov, Cengiz Pehlevan · 93 / 16 / 0 · 26 Sep 2024
 7. A Theory of Non-Linear Feature Learning with One Gradient Step in Two-Layer Neural Networks
    Behrad Moniri, Donghwan Lee, Hamed Hassani, Yan Sun · MLT · 84 / 22 / 0 · 11 Oct 2023
 8. Smoothing the Landscape Boosts the Signal for SGD: Optimal Sample Complexity for Learning Single Index Models
    Alexandru Damian, Eshaan Nichani, Rong Ge, Jason D. Lee · MLT · 77 / 39 / 0 · 18 May 2023
 9. Dynamics of Finite Width Kernel and Prediction Fluctuations in Mean Field Neural Networks
    Blake Bordelon, Cengiz Pehlevan · MLT · 70 / 31 / 0 · 06 Apr 2023
10. Learning time-scales in two-layers neural networks
    Raphael Berthier, Andrea Montanari, Kangjie Zhou · 135 / 36 / 0 · 28 Feb 2023
11. SGD learning on neural networks: leap complexity and saddle-to-saddle dynamics
    Emmanuel Abbe, Enric Boix-Adserà, Theodor Misiakiewicz · FedML, MLT · 138 / 86 / 0 · 21 Feb 2023
12. Universality laws for Gaussian mixtures in generalized linear models
    Yatin Dandi, Ludovic Stephan, Florent Krzakala, Bruno Loureiro, Lenka Zdeborová · FedML · 72 / 21 / 0 · 17 Feb 2023
13. Precise Asymptotic Analysis of Deep Random Feature Models
    David Bosch, Ashkan Panahi, B. Hassibi · 66 / 19 / 0 · 13 Feb 2023
14. From high-dimensional & mean-field dynamics to dimensionless ODEs: A unifying approach to SGD in two-layers networks
    Luca Arnaboldi, Ludovic Stephan, Florent Krzakala, Bruno Loureiro · MLT · 79 / 33 / 0 · 12 Feb 2023
15. Deterministic equivalent and error universality of deep random features learning
    Dominik Schröder, Hugo Cui, Daniil Dmitriev, Bruno Loureiro · MLT · 73 / 28 / 0 · 01 Feb 2023
16. Learning Single-Index Models with Shallow Neural Networks
    A. Bietti, Joan Bruna, Clayton Sanford, M. Song · 193 / 71 / 0 · 27 Oct 2022
17. Neural Networks can Learn Representations with Gradient Descent
    Alexandru Damian, Jason D. Lee, Mahdi Soltanolkotabi · SSL, MLT · 90 / 123 / 0 · 30 Jun 2022
18. Learning sparse features can lead to overfitting in neural networks
    Leonardo Petrini, Francesco Cagnetta, Eric Vanden-Eijnden, Matthieu Wyart · MLT · 78 / 25 / 0 · 24 Jun 2022
19. High-dimensional limit theorems for SGD: Effective dynamics and critical scaling
    Gerard Ben Arous, Reza Gheissari, Aukosh Jagannath · 102 / 59 / 0 · 08 Jun 2022
20. Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs
    Etienne Boursier, Loucas Pillaud-Vivien, Nicolas Flammarion · ODL · 51 / 61 / 0 · 02 Jun 2022
21. High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation
    Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang · MLT · 87 / 128 / 0 · 03 May 2022
22. Universality of empirical risk minimization
    Andrea Montanari, Basil Saeed · OOD · 63 / 78 / 0 · 17 Feb 2022
23. Fluctuations, Bias, Variance & Ensemble of Learners: Exact Asymptotics for Convex Losses in High-Dimension
    Bruno Loureiro, Cédric Gerbelot, Maria Refinetti, G. Sicuro, Florent Krzakala · 72 / 27 / 0 · 31 Jan 2022
24. Separation of Scales and a Thermodynamic Description of Feature Learning in Some CNNs
    Inbar Seroussi, Gadi Naveh, Zohar Ringel · 74 / 55 / 0 · 31 Dec 2021
25. Neural Networks as Kernel Learners: The Silent Alignment Effect
    Alexander B. Atanasov, Blake Bordelon, Cengiz Pehlevan · MLT · 77 / 85 / 0 · 29 Oct 2021
26. The Eigenlearning Framework: A Conservation Law Perspective on Kernel Regression and Wide Neural Networks
    James B. Simon, Madeline Dickens, Dhruva Karkada, M. DeWeese · 74 / 28 / 0 · 08 Oct 2021
27. The staircase property: How hierarchical structure can guide deep learning
    Emmanuel Abbe, Enric Boix-Adserà, Matthew Brennan, Guy Bresler, Dheeraj M. Nagaraj · 51 / 55 / 0 · 24 Aug 2021
28. Saddle-to-Saddle Dynamics in Deep Linear Networks: Small Initialization Training, Symmetry, and Sparsity
    Arthur Jacot, François Ged, Berfin Şimşek, Clément Hongler, Franck Gabriel · 58 / 55 / 0 · 30 Jun 2021
29. A self consistent theory of Gaussian Processes captures feature learning effects in finite CNNs
    Gadi Naveh, Zohar Ringel · SSL, MLT · 72 / 32 / 0 · 08 Jun 2021
30. Generalization Error Rates in Kernel Regression: The Crossover from the Noiseless to Noisy Regime
    Hugo Cui, Bruno Loureiro, Florent Krzakala, Lenka Zdeborová · 76 / 85 / 0 · 31 May 2021
31. How rotational invariance of common kernels prevents generalization in high dimensions
    Konstantin Donhauser, Mingqi Wu, Fanny Yang · 76 / 24 / 0 · 09 Apr 2021
32. Learning curves of generic features maps for realistic datasets with a teacher-student model
    Bruno Loureiro, Cédric Gerbelot, Hugo Cui, Sebastian Goldt, Florent Krzakala, M. Mézard, Lenka Zdeborová · 99 / 140 / 0 · 16 Feb 2021
33. Generalization error of random features and kernel methods: hypercontractivity and kernel matrix concentration
    Song Mei, Theodor Misiakiewicz, Andrea Montanari · 84 / 112 / 0 · 26 Jan 2021
34. The Gaussian equivalence of generative models for learning with shallow neural networks
    Sebastian Goldt, Bruno Loureiro, Galen Reeves, Florent Krzakala, M. Mézard, Lenka Zdeborová · BDL · 85 / 107 / 0 · 25 Jun 2020
35. When Do Neural Networks Outperform Kernel Methods?
    Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari · 88 / 189 / 0 · 24 Jun 2020
36. Spectral Bias and Task-Model Alignment Explain Generalization in Kernel Regression and Infinitely Wide Neural Networks
    Abdulkadir Canatar, Blake Bordelon, Cengiz Pehlevan · 97 / 189 / 0 · 23 Jun 2020
37. Online stochastic gradient descent on non-convex losses from high-dimensional inference
    Gerard Ben Arous, Reza Gheissari, Aukosh Jagannath · 65 / 89 / 0 · 23 Mar 2020
38. The large learning rate phase of deep learning: the catapult mechanism
    Aitor Lewkowycz, Yasaman Bahri, Ethan Dyer, Jascha Narain Sohl-Dickstein, Guy Gur-Ari · ODL · 194 / 241 / 0 · 04 Mar 2020
39. Generalisation error in learning with random features and the hidden manifold model
    Federica Gerace, Bruno Loureiro, Florent Krzakala, M. Mézard, Lenka Zdeborová · 67 / 172 / 0 · 21 Feb 2020
40. Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks
    Blake Bordelon, Abdulkadir Canatar, Cengiz Pehlevan · 225 / 208 / 0 · 07 Feb 2020
41. The generalization error of random features regression: Precise asymptotics and double descent curve
    Song Mei, Andrea Montanari · 95 / 639 / 0 · 14 Aug 2019
42. Limitations of Lazy Training of Two-layers Neural Networks
    Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari · MLT · 55 / 143 / 0 · 21 Jun 2019
43. SGD on Neural Networks Learns Functions of Increasing Complexity
    Preetum Nakkiran, Gal Kaplun, Dimitris Kalimeris, Tristan Yang, Benjamin L. Edelman, Fred Zhang, Boaz Barak · MLT · 133 / 247 / 0 · 28 May 2019
44. On Lazy Training in Differentiable Programming
    Lénaïc Chizat, Edouard Oyallon, Francis R. Bach · 111 / 839 / 0 · 19 Dec 2018
45. Mean Field Analysis of Neural Networks: A Central Limit Theorem
    Justin A. Sirignano, K. Spiliopoulos · MLT · 75 / 194 / 0 · 28 Aug 2018
46. On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport
    Lénaïc Chizat, Francis R. Bach · OT · 212 / 736 / 0 · 24 May 2018
47. Trainability and Accuracy of Neural Networks: An Interacting Particle System Approach
    Grant M. Rotskoff, Eric Vanden-Eijnden · 114 / 123 / 0 · 02 May 2018
48. A Mean Field View of the Landscape of Two-Layers Neural Networks
    Song Mei, Andrea Montanari, Phan-Minh Nguyen · MLT · 98 / 861 / 0 · 18 Apr 2018
49. Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour
    Priya Goyal, Piotr Dollár, Ross B. Girshick, P. Noordhuis, Lukasz Wesolowski, Aapo Kyrola, Andrew Tulloch, Yangqing Jia, Kaiming He · 3DH · 128 / 3,685 / 0 · 08 Jun 2017
50. Generalization Properties of Learning with Random Features
    Alessandro Rudi, Lorenzo Rosasco · MLT · 68 / 331 / 0 · 14 Feb 2016