A spring-block theory of feature learning in deep neural networks
28 July 2024
Chengzhi Shi
Liming Pan
Ivan Dokmanić
AI4CE
arXiv: 2407.19353
Papers citing "A spring-block theory of feature learning in deep neural networks" (49 papers shown)
The boundary of neural network trainability is fractal
Jascha Narain Sohl-Dickstein
52
8
0
09 Feb 2024
Asymptotics of feature learning in two-layer networks after one gradient-step
Hugo Cui
Luca Pesce
Yatin Dandi
Florent Krzakala
Yue M. Lu
Lenka Zdeborová
Bruno Loureiro
MLT
88
19
0
07 Feb 2024
The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents
Yatin Dandi
Emanuele Troiani
Luca Arnaboldi
Luca Pesce
Lenka Zdeborová
Florent Krzakala
MLT
78
29
0
05 Feb 2024
A Dynamical Model of Neural Scaling Laws
Blake Bordelon
Alexander B. Atanasov
Cengiz Pehlevan
84
41
0
02 Feb 2024
Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training
Yefan Zhou
Tianyu Pang
Keqin Liu
Charles H. Martin
Michael W. Mahoney
Yaoqing Yang
95
11
0
01 Dec 2023
On the different regimes of Stochastic Gradient Descent
Antonio Sclocchi
Matthieu Wyart
49
20
0
19 Sep 2023
A Neural Collapse Perspective on Feature Evolution in Graph Neural Networks
Vignesh Kothapalli
Tom Tirer
Joan Bruna
61
13
0
04 Jul 2023
Stochastic Collapse: How Gradient Noise Attracts SGD Dynamics Towards Simpler Subnetworks
F. Chen
D. Kunin
Atsushi Yamamura
Surya Ganguli
80
27
0
07 Jun 2023
The Law of Parsimony in Gradient Descent for Learning Deep Linear Networks
Can Yaras
Peng Wang
Wei Hu
Zhihui Zhu
Laura Balzano
Qing Qu
63
18
0
01 Jun 2023
A Rainbow in Deep Network Black Boxes
Florentin Guth
Brice Ménard
G. Rochette
S. Mallat
74
11
0
29 May 2023
Stochastic Modified Equations and Dynamics of Dropout Algorithm
Zhongwang Zhang
Yuqing Li
Yaoyu Zhang
Z. Xu
41
9
0
25 May 2023
Phase transitions in the mini-batch size for sparse and dense two-layer neural networks
Raffaele Marino
F. Ricci-Tersenghi
52
15
0
10 May 2023
Inversion dynamics of class manifolds in deep learning reveals tradeoffs underlying generalisation
Simone Ciceri
Lorenzo Cassani
Matteo Osella
P. Rotondo
P. Pizzochero
M. Gherardi
54
7
0
09 Mar 2023
Injectivity of ReLU networks: perspectives from statistical physics
Antoine Maillard
Afonso S. Bandeira
David Belius
Ivan Dokmanić
S. Nakajima
44
5
0
27 Feb 2023
From high-dimensional & mean-field dynamics to dimensionless ODEs: A unifying approach to SGD in two-layers networks
Luca Arnaboldi
Ludovic Stephan
Florent Krzakala
Bruno Loureiro
MLT
62
33
0
12 Feb 2023
Homophily modulates double descent generalization in graph convolution networks
Chengzhi Shi
Liming Pan
Hong Hu
Ivan Dokmanić
53
9
0
26 Dec 2022
A Law of Data Separation in Deep Learning
Hangfeng He
Weijie J. Su
OOD
71
41
0
31 Oct 2022
High-dimensional limit theorems for SGD: Effective dynamics and critical scaling
Gerard Ben Arous
Reza Gheissari
Aukosh Jagannath
89
58
0
08 Jun 2022
Stochastic gradient descent introduces an effective landscape-dependent regularization favoring flat solutions
Ning Yang
Chao Tang
Yuhai Tu
MLT
27
21
0
02 Jun 2022
Self-Consistent Dynamical Field Theory of Kernel Evolution in Wide Neural Networks
Blake Bordelon
Cengiz Pehlevan
MLT
56
82
0
19 May 2022
Feature Learning and Signal Propagation in Deep Neural Networks
Yizhang Lou
Chris Mingard
Yoonsoo Nam
Soufiane Hayou
MDE
58
18
0
22 Oct 2021
Unveiling the structure of wide flat minima in neural networks
Carlo Baldassi
Clarissa Lauditi
Enrico M. Malatesta
Gabriele Perugini
R. Zecchina
51
34
0
02 Jul 2021
Label Noise SGD Provably Prefers Flat Global Minimizers
Alexandru Damian
Tengyu Ma
Jason D. Lee
NoLa
97
119
0
11 Jun 2021
Exploring Deep Neural Networks via Layer-Peeled Model: Minority Collapse in Imbalanced Training
Cong Fang
Hangfeng He
Qi Long
Weijie J. Su
FAtt
147
170
0
29 Jan 2021
Statistical Mechanics of Deep Linear Neural Networks: The Back-Propagating Kernel Renormalization
Qianyi Li
H. Sompolinsky
131
72
0
07 Dec 2020
Prevalence of Neural Collapse during the terminal phase of deep learning training
Vardan Papyan
Xuemei Han
D. Donoho
184
574
0
18 Aug 2020
Phase diagram for two-layer ReLU neural networks at infinite-width limit
Tao Luo
Zhi-Qin John Xu
Zheng Ma
Yaoyu Zhang
50
61
0
15 Jul 2020
What Do Neural Networks Learn When Trained With Random Labels?
Hartmut Maennel
Ibrahim Alabdulmohsin
Ilya O. Tolstikhin
R. Baldock
Olivier Bousquet
Sylvain Gelly
Daniel Keysers
FedML
140
89
0
18 Jun 2020
The Hidden Convex Optimization Landscape of Two-Layer ReLU Neural Networks: an Exact Characterization of the Optimal Solutions
Yifei Wang
Jonathan Lacotte
Mert Pilanci
MLT
50
27
0
10 Jun 2020
On Exact Computation with an Infinitely Wide Neural Net
Sanjeev Arora
S. Du
Wei Hu
Zhiyuan Li
Ruslan Salakhutdinov
Ruosong Wang
209
922
0
26 Apr 2019
On Lazy Training in Differentiable Programming
Lénaïc Chizat
Edouard Oyallon
Francis R. Bach
102
833
0
19 Dec 2018
A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks
Sanjeev Arora
Nadav Cohen
Noah Golowich
Wei Hu
110
290
0
04 Oct 2018
Geometry of energy landscapes and the optimizability of deep neural networks
Simon Becker
Yao Zhang
A. Lee
32
30
0
01 Aug 2018
Neural Tangent Kernel: Convergence and Generalization in Neural Networks
Arthur Jacot
Franck Gabriel
Clément Hongler
252
3,194
0
20 Jun 2018
On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport
Lénaïc Chizat
Francis R. Bach
OT
200
735
0
24 May 2018
A Mean Field View of the Landscape of Two-Layers Neural Networks
Song Mei
Andrea Montanari
Phan-Minh Nguyen
MLT
81
858
0
18 Apr 2018
On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization
Sanjeev Arora
Nadav Cohen
Elad Hazan
97
483
0
19 Feb 2018
Three Factors Influencing Minima in SGD
Stanislaw Jastrzebski
Zachary Kenton
Devansh Arpit
Nicolas Ballas
Asja Fischer
Yoshua Bengio
Amos Storkey
76
463
0
13 Nov 2017
Rethinking generalization requires revisiting old ideas: statistical mechanics approaches and complex learning behavior
Charles H. Martin
Michael W. Mahoney
AI4CE
47
64
0
26 Oct 2017
A Bayesian Perspective on Generalization and Stochastic Gradient Descent
Samuel L. Smith
Quoc V. Le
BDL
61
251
0
17 Oct 2017
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour
Priya Goyal
Piotr Dollár
Ross B. Girshick
P. Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
3DH
120
3,675
0
08 Jun 2017
Exponential expressivity in deep neural networks through transient chaos
Ben Poole
Subhaneil Lahiri
M. Raghu
Jascha Narain Sohl-Dickstein
Surya Ganguli
88
591
0
16 Jun 2016
Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity
Amit Daniely
Roy Frostig
Y. Singer
156
343
0
18 Feb 2016
Stochastic modified equations and adaptive stochastic gradient algorithms
Qianxiao Li
Cheng Tai
Weinan E
59
284
0
19 Nov 2015
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe
Christian Szegedy
OOD
439
43,277
0
11 Feb 2015
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
VLM
298
18,587
0
06 Feb 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.6K
150,006
0
22 Dec 2014
Training Convolutional Networks with Noisy Labels
Sainbayar Sukhbaatar
Joan Bruna
Manohar Paluri
Lubomir D. Bourdev
Rob Fergus
NoLa
89
272
0
09 Jun 2014
Visualizing and Understanding Convolutional Networks
Matthew D. Zeiler
Rob Fergus
FAtt
SSL
563
15,874
0
12 Nov 2013