A spring-block theory of feature learning in deep neural networks

28 July 2024
Chengzhi Shi
Liming Pan
Ivan Dokmanić
    AI4CE

Papers citing "A spring-block theory of feature learning in deep neural networks"

49 / 49 papers shown
The boundary of neural network trainability is fractal
Jascha Narain Sohl-Dickstein
52
8
0
09 Feb 2024
Asymptotics of feature learning in two-layer networks after one gradient-step
Hugo Cui
Luca Pesce
Yatin Dandi
Florent Krzakala
Yue M. Lu
Lenka Zdeborová
Bruno Loureiro
MLT
88
19
0
07 Feb 2024
The Benefits of Reusing Batches for Gradient Descent in Two-Layer Networks: Breaking the Curse of Information and Leap Exponents
Yatin Dandi
Emanuele Troiani
Luca Arnaboldi
Luca Pesce
Lenka Zdeborová
Florent Krzakala
MLT
78
29
0
05 Feb 2024
A Dynamical Model of Neural Scaling Laws
Blake Bordelon
Alexander B. Atanasov
Cengiz Pehlevan
84
41
0
02 Feb 2024
Temperature Balancing, Layer-wise Weight Analysis, and Neural Network Training
Yefan Zhou
Tianyu Pang
Keqin Liu
Charles H. Martin
Michael W. Mahoney
Yaoqing Yang
95
11
0
01 Dec 2023
On the different regimes of Stochastic Gradient Descent
Antonio Sclocchi
Matthieu Wyart
49
20
0
19 Sep 2023
A Neural Collapse Perspective on Feature Evolution in Graph Neural Networks
Vignesh Kothapalli
Tom Tirer
Joan Bruna
61
13
0
04 Jul 2023
Stochastic Collapse: How Gradient Noise Attracts SGD Dynamics Towards Simpler Subnetworks
F. Chen
D. Kunin
Atsushi Yamamura
Surya Ganguli
80
27
0
07 Jun 2023
The Law of Parsimony in Gradient Descent for Learning Deep Linear Networks
Can Yaras
Peng Wang
Wei Hu
Zhihui Zhu
Laura Balzano
Qing Qu
63
18
0
01 Jun 2023
A Rainbow in Deep Network Black Boxes
Florentin Guth
Brice Ménard
G. Rochette
S. Mallat
74
11
0
29 May 2023
Stochastic Modified Equations and Dynamics of Dropout Algorithm
Zhongwang Zhang
Yuqing Li
Yaoyu Zhang
Z. Xu
41
9
0
25 May 2023
Phase transitions in the mini-batch size for sparse and dense two-layer neural networks
Raffaele Marino
F. Ricci-Tersenghi
52
15
0
10 May 2023
Inversion dynamics of class manifolds in deep learning reveals tradeoffs underlying generalisation
Simone Ciceri
Lorenzo Cassani
Matteo Osella
P. Rotondo
P. Pizzochero
M. Gherardi
54
7
0
09 Mar 2023
Injectivity of ReLU networks: perspectives from statistical physics
Antoine Maillard
Afonso S. Bandeira
David Belius
Ivan Dokmanić
S. Nakajima
44
5
0
27 Feb 2023
From high-dimensional & mean-field dynamics to dimensionless ODEs: A unifying approach to SGD in two-layers networks
Luca Arnaboldi
Ludovic Stephan
Florent Krzakala
Bruno Loureiro
MLT
62
33
0
12 Feb 2023
Homophily modulates double descent generalization in graph convolution networks
Chengzhi Shi
Liming Pan
Hong Hu
Ivan Dokmanić
53
9
0
26 Dec 2022
A Law of Data Separation in Deep Learning
Hangfeng He
Weijie J. Su
OOD
71
41
0
31 Oct 2022
High-dimensional limit theorems for SGD: Effective dynamics and critical scaling
Gerard Ben Arous
Reza Gheissari
Aukosh Jagannath
89
58
0
08 Jun 2022
Stochastic gradient descent introduces an effective landscape-dependent regularization favoring flat solutions
Ning Yang
Chao Tang
Yuhai Tu
MLT
27
21
0
02 Jun 2022
Self-Consistent Dynamical Field Theory of Kernel Evolution in Wide Neural Networks
Blake Bordelon
Cengiz Pehlevan
MLT
56
82
0
19 May 2022
Feature Learning and Signal Propagation in Deep Neural Networks
Yizhang Lou
Chris Mingard
Yoonsoo Nam
Soufiane Hayou
MDE
58
18
0
22 Oct 2021
Unveiling the structure of wide flat minima in neural networks
Carlo Baldassi
Clarissa Lauditi
Enrico M. Malatesta
Gabriele Perugini
R. Zecchina
51
34
0
02 Jul 2021
Label Noise SGD Provably Prefers Flat Global Minimizers
Alexandru Damian
Tengyu Ma
Jason D. Lee
NoLa
97
119
0
11 Jun 2021
Exploring Deep Neural Networks via Layer-Peeled Model: Minority Collapse in Imbalanced Training
Cong Fang
Hangfeng He
Qi Long
Weijie J. Su
FAtt
147
170
0
29 Jan 2021
Statistical Mechanics of Deep Linear Neural Networks: The Back-Propagating Kernel Renormalization
Qianyi Li
H. Sompolinsky
131
72
0
07 Dec 2020
Prevalence of Neural Collapse during the terminal phase of deep learning training
Vardan Papyan
Xuemei Han
D. Donoho
184
574
0
18 Aug 2020
Phase diagram for two-layer ReLU neural networks at infinite-width limit
Yaoyu Zhang
Zhi-Qin John Xu
Zheng Ma
50
61
0
15 Jul 2020
What Do Neural Networks Learn When Trained With Random Labels?
Hartmut Maennel
Ibrahim Alabdulmohsin
Ilya O. Tolstikhin
R. Baldock
Olivier Bousquet
Sylvain Gelly
Daniel Keysers
FedML
140
89
0
18 Jun 2020
The Hidden Convex Optimization Landscape of Two-Layer ReLU Neural Networks: an Exact Characterization of the Optimal Solutions
Yifei Wang
Jonathan Lacotte
Mert Pilanci
MLT
50
27
0
10 Jun 2020
On Exact Computation with an Infinitely Wide Neural Net
Sanjeev Arora
S. Du
Wei Hu
Zhiyuan Li
Ruslan Salakhutdinov
Ruosong Wang
209
922
0
26 Apr 2019
On Lazy Training in Differentiable Programming
Lénaïc Chizat
Edouard Oyallon
Francis R. Bach
102
833
0
19 Dec 2018
A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks
Sanjeev Arora
Nadav Cohen
Noah Golowich
Wei Hu
110
290
0
04 Oct 2018
Geometry of energy landscapes and the optimizability of deep neural networks
Simon Becker
Yao Zhang
A. Lee
32
30
0
01 Aug 2018
Neural Tangent Kernel: Convergence and Generalization in Neural Networks
Arthur Jacot
Franck Gabriel
Clément Hongler
252
3,194
0
20 Jun 2018
On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport
Lénaïc Chizat
Francis R. Bach
OT
200
735
0
24 May 2018
A Mean Field View of the Landscape of Two-Layers Neural Networks
Song Mei
Andrea Montanari
Phan-Minh Nguyen
MLT
81
858
0
18 Apr 2018
On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization
Sanjeev Arora
Nadav Cohen
Elad Hazan
97
483
0
19 Feb 2018
Three Factors Influencing Minima in SGD
Stanislaw Jastrzebski
Zachary Kenton
Devansh Arpit
Nicolas Ballas
Asja Fischer
Yoshua Bengio
Amos Storkey
76
463
0
13 Nov 2017
Rethinking generalization requires revisiting old ideas: statistical mechanics approaches and complex learning behavior
Charles H. Martin
Michael W. Mahoney
AI4CE
47
64
0
26 Oct 2017
A Bayesian Perspective on Generalization and Stochastic Gradient Descent
Samuel L. Smith
Quoc V. Le
BDL
61
251
0
17 Oct 2017
Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour
Priya Goyal
Piotr Dollár
Ross B. Girshick
P. Noordhuis
Lukasz Wesolowski
Aapo Kyrola
Andrew Tulloch
Yangqing Jia
Kaiming He
3DH
120
3,675
0
08 Jun 2017
Exponential expressivity in deep neural networks through transient chaos
Ben Poole
Subhaneil Lahiri
M. Raghu
Jascha Narain Sohl-Dickstein
Surya Ganguli
88
591
0
16 Jun 2016
Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity
Amit Daniely
Roy Frostig
Y. Singer
156
343
0
18 Feb 2016
Stochastic modified equations and adaptive stochastic gradient algorithms
Qianxiao Li
Cheng Tai
E. Weinan
59
284
0
19 Nov 2015
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe
Christian Szegedy
OOD
439
43,277
0
11 Feb 2015
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
Kaiming He
Xiangyu Zhang
Shaoqing Ren
Jian Sun
VLM
298
18,587
0
06 Feb 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
1.6K
150,006
0
22 Dec 2014
Training Convolutional Networks with Noisy Labels
Sainbayar Sukhbaatar
Joan Bruna
Manohar Paluri
Lubomir D. Bourdev
Rob Fergus
NoLa
89
272
0
09 Jun 2014
Visualizing and Understanding Convolutional Networks
Matthew D. Zeiler
Rob Fergus
FAtt
SSL
563
15,874
0
12 Nov 2013