2009.10685
Cited By
Tensor Programs III: Neural Matrix Laws
22 September 2020
Greg Yang
Papers citing
"Tensor Programs III: Neural Matrix Laws"
34 / 34 papers shown
Title
Global Convergence and Rich Feature Learning in L-Layer Infinite-Width Neural Networks under μP Parametrization
Zixiang Chen
Greg Yang
Qingyue Zhao
Q. Gu
MLT
50
0
0
12 Mar 2025
Function-Space Learning Rates
Edward Milsom
Ben Anson
Laurence Aitchison
67
1
0
24 Feb 2025
u-μP: The Unit-Scaled Maximal Update Parametrization
Charlie Blake
C. Eichenberg
Josef Dean
Lukas Balles
Luke Y. Prince
Bjorn Deiseroth
Andres Felipe Cruz Salinas
Carlo Luschi
Samuel Weinbach
Douglas Orr
55
9
0
24 Jul 2024
The Impact of Initialization on LoRA Finetuning Dynamics
Soufiane Hayou
Nikhil Ghosh
Bin Yu
AI4CE
36
11
0
12 Jun 2024
μLO: Compute-Efficient Meta-Generalization of Learned Optimizers
Benjamin Thérien
Charles-Étienne Joseph
Boris Knyazev
Edouard Oyallon
Irina Rish
Eugene Belilovsky
AI4CE
40
1
0
31 May 2024
How to Capture Higher-order Correlations? Generalizing Matrix Softmax Attention to Kronecker Computation
Josh Alman
Zhao-quan Song
35
31
0
06 Oct 2023
Commutative Width and Depth Scaling in Deep Neural Networks
Soufiane Hayou
43
2
0
02 Oct 2023
Width and Depth Limits Commute in Residual Networks
Soufiane Hayou
Greg Yang
42
14
0
01 Feb 2023
A Kernel-Based View of Language Model Fine-Tuning
Sadhika Malladi
Alexander Wettig
Dingli Yu
Danqi Chen
Sanjeev Arora
VLM
68
60
0
11 Oct 2022
On the infinite-depth limit of finite-width neural networks
Soufiane Hayou
27
22
0
03 Oct 2022
Neural Tangent Kernel: A Survey
Eugene Golikov
Eduard Pokonechnyy
Vladimir Korviakov
27
12
0
29 Aug 2022
Wide Bayesian neural networks have a simple weight posterior: theory and accelerated sampling
Jiri Hron
Roman Novak
Jeffrey Pennington
Jascha Narain Sohl-Dickstein
UQCV
BDL
48
6
0
15 Jun 2022
Adversarial Noises Are Linearly Separable for (Nearly) Random Neural Networks
Huishuai Zhang
Da Yu
Yiping Lu
Di He
AAML
27
1
0
09 Jun 2022
Global Convergence of Over-parameterized Deep Equilibrium Models
Zenan Ling
Xingyu Xie
Qiuhao Wang
Zongpeng Zhang
Zhouchen Lin
32
12
0
27 May 2022
High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation
Jimmy Ba
Murat A. Erdogdu
Taiji Suzuki
Zhichao Wang
Denny Wu
Greg Yang
MLT
40
121
0
03 May 2022
Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer
Greg Yang
J. E. Hu
Igor Babuschkin
Szymon Sidor
Xiaodong Liu
David Farhi
Nick Ryder
J. Pachocki
Weizhu Chen
Jianfeng Gao
26
148
0
07 Mar 2022
A duality connecting neural network and cosmological dynamics
Sven Krippendorf
M. Spannowsky
AI4CE
30
8
0
22 Feb 2022
Eigenvalue Distribution of Large Random Matrices Arising in Deep Neural Networks: Orthogonal Case
L. Pastur
19
5
0
12 Jan 2022
Feature Learning and Signal Propagation in Deep Neural Networks
Yizhang Lou
Chris Mingard
Yoonsoo Nam
Soufiane Hayou
MDE
24
17
0
22 Oct 2021
Nonperturbative renormalization for the neural network-QFT correspondence
Harold Erbin
Vincent Lahoche
D. O. Samary
41
30
0
03 Aug 2021
Random Neural Networks in the Infinite Width Limit as Gaussian Processes
Boris Hanin
BDL
32
43
0
04 Jul 2021
Implicit Acceleration and Feature Learning in Infinitely Wide Neural Networks with Bottlenecks
Etai Littwin
Omid Saremi
Shuangfei Zhai
Vimal Thilak
Hanlin Goh
J. Susskind
Greg Yang
25
3
0
01 Jul 2021
Regularization in ResNet with Stochastic Depth
Soufiane Hayou
Fadhel Ayed
17
10
0
06 Jun 2021
Priors in Bayesian Deep Learning: A Review
Vincent Fortuin
UQCV
BDL
31
124
0
14 May 2021
Tensor Programs IIb: Architectural Universality of Neural Tangent Kernel Training Dynamics
Greg Yang
Etai Littwin
9
64
0
08 May 2021
Asymptotic Freeness of Layerwise Jacobians Caused by Invariance of Multilayer Perceptron: The Haar Orthogonal Case
B. Collins
Tomohiro Hayase
22
7
0
24 Mar 2021
Feature Learning in Infinite-Width Neural Networks
Greg Yang
J. E. Hu
MLT
9
147
0
30 Nov 2020
On Random Matrices Arising in Deep Neural Networks: General I.I.D. Case
L. Pastur
V. Slavin
CML
24
12
0
20 Nov 2020
Stable ResNet
Soufiane Hayou
Eugenio Clerico
Bo He
George Deligiannidis
Arnaud Doucet
Judith Rousseau
ODL
SSeg
46
51
0
24 Oct 2020
Neural Networks and Quantum Field Theory
James Halverson
Anindita Maiti
Keegan Stoner
8
75
0
19 Aug 2020
Tensor Programs II: Neural Tangent Kernel for Any Architecture
Greg Yang
55
134
0
25 Jun 2020
The Recurrent Neural Tangent Kernel
Sina Alemohammad
Zichao Wang
Randall Balestriero
Richard Baraniuk
AAML
6
77
0
18 Jun 2020
Exact Convergence Rates of the Neural Tangent Kernel in the Large Depth Limit
Soufiane Hayou
Arnaud Doucet
Judith Rousseau
16
4
0
31 May 2019
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
Lechao Xiao
Yasaman Bahri
Jascha Narain Sohl-Dickstein
S. Schoenholz
Jeffrey Pennington
227
348
0
14 Jun 2018