Tensor Programs III: Neural Matrix Laws

22 September 2020

Papers citing "Tensor Programs III: Neural Matrix Laws"

34 / 34 papers shown

Title
Global Convergence and Rich Feature Learning in $L$-Layer Infinite-Width Neural Networks under $\mu$P Parametrization Zixiang Chen Greg Yang Qingyue Zhao Q. Gu MLT 50 0 0 12 Mar 2025
Function-Space Learning Rates Edward Milsom Ben Anson Laurence Aitchison 67 1 0 24 Feb 2025
u-$\mu$P: The Unit-Scaled Maximal Update Parametrization Charlie Blake C. Eichenberg Josef Dean Lukas Balles Luke Y. Prince Bjorn Deiseroth Andres Felipe Cruz Salinas Carlo Luschi Samuel Weinbach Douglas Orr 55 9 0 24 Jul 2024
The Impact of Initialization on LoRA Finetuning Dynamics Soufiane Hayou Nikhil Ghosh Bin Yu AI4CE 36 11 0 12 Jun 2024
$\mu$LO: Compute-Efficient Meta-Generalization of Learned Optimizers Benjamin Thérien Charles-Étienne Joseph Boris Knyazev Edouard Oyallon Irina Rish Eugene Belilovsky AI4CE 40 1 0 31 May 2024
How to Capture Higher-order Correlations? Generalizing Matrix Softmax Attention to Kronecker Computation Josh Alman Zhao-quan Song 35 31 0 06 Oct 2023
Commutative Width and Depth Scaling in Deep Neural Networks Soufiane Hayou 43 2 0 02 Oct 2023
Width and Depth Limits Commute in Residual Networks Soufiane Hayou Greg Yang 42 14 0 01 Feb 2023
A Kernel-Based View of Language Model Fine-Tuning Sadhika Malladi Alexander Wettig Dingli Yu Danqi Chen Sanjeev Arora VLM 68 60 0 11 Oct 2022
On the infinite-depth limit of finite-width neural networks Soufiane Hayou 27 22 0 03 Oct 2022
Neural Tangent Kernel: A Survey Eugene Golikov Eduard Pokonechnyy Vladimir Korviakov 27 12 0 29 Aug 2022
Wide Bayesian neural networks have a simple weight posterior: theory and accelerated sampling Jiri Hron Roman Novak Jeffrey Pennington Jascha Narain Sohl-Dickstein UQCV BDL 48 6 0 15 Jun 2022
Adversarial Noises Are Linearly Separable for (Nearly) Random Neural Networks Huishuai Zhang Da Yu Yiping Lu Di He AAML 27 1 0 09 Jun 2022
Global Convergence of Over-parameterized Deep Equilibrium Models Zenan Ling Xingyu Xie Qiuhao Wang Zongpeng Zhang Zhouchen Lin 32 12 0 27 May 2022
High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation Jimmy Ba Murat A. Erdogdu Taiji Suzuki Zhichao Wang Denny Wu Greg Yang MLT 40 121 0 03 May 2022
Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer Greg Yang J. E. Hu Igor Babuschkin Szymon Sidor Xiaodong Liu David Farhi Nick Ryder J. Pachocki Weizhu Chen Jianfeng Gao 26 148 0 07 Mar 2022
A duality connecting neural network and cosmological dynamics Sven Krippendorf M. Spannowsky AI4CE 30 8 0 22 Feb 2022
Eigenvalue Distribution of Large Random Matrices Arising in Deep Neural Networks: Orthogonal Case L. Pastur 19 5 0 12 Jan 2022
Feature Learning and Signal Propagation in Deep Neural Networks Yizhang Lou Chris Mingard Yoonsoo Nam Soufiane Hayou MDE 24 17 0 22 Oct 2021
Nonperturbative renormalization for the neural network-QFT correspondence Harold Erbin Vincent Lahoche D. O. Samary 41 30 0 03 Aug 2021
Random Neural Networks in the Infinite Width Limit as Gaussian Processes Boris Hanin BDL 32 43 0 04 Jul 2021
Implicit Acceleration and Feature Learning in Infinitely Wide Neural Networks with Bottlenecks Etai Littwin Omid Saremi Shuangfei Zhai Vimal Thilak Hanlin Goh J. Susskind Greg Yang 25 3 0 01 Jul 2021
Regularization in ResNet with Stochastic Depth Soufiane Hayou Fadhel Ayed 17 10 0 06 Jun 2021
Priors in Bayesian Deep Learning: A Review Vincent Fortuin UQCV BDL 31 124 0 14 May 2021
Tensor Programs IIb: Architectural Universality of Neural Tangent Kernel Training Dynamics Greg Yang Etai Littwin 9 64 0 08 May 2021
Asymptotic Freeness of Layerwise Jacobians Caused by Invariance of Multilayer Perceptron: The Haar Orthogonal Case B. Collins Tomohiro Hayase 22 7 0 24 Mar 2021
Feature Learning in Infinite-Width Neural Networks Greg Yang J. E. Hu MLT 9 147 0 30 Nov 2020
On Random Matrices Arising in Deep Neural Networks: General I.I.D. Case L. Pastur V. Slavin CML 24 12 0 20 Nov 2020
Stable ResNet Soufiane Hayou Eugenio Clerico Bo He George Deligiannidis Arnaud Doucet Judith Rousseau ODL SSeg 46 51 0 24 Oct 2020
Neural Networks and Quantum Field Theory James Halverson Anindita Maiti Keegan Stoner 8 75 0 19 Aug 2020
Tensor Programs II: Neural Tangent Kernel for Any Architecture Greg Yang 55 134 0 25 Jun 2020
The Recurrent Neural Tangent Kernel Sina Alemohammad Zichao Wang Randall Balestriero Richard Baraniuk AAML 6 77 0 18 Jun 2020
Exact Convergence Rates of the Neural Tangent Kernel in the Large Depth Limit Soufiane Hayou Arnaud Doucet Judith Rousseau 16 4 0 31 May 2019
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks Lechao Xiao Yasaman Bahri Jascha Narain Sohl-Dickstein S. Schoenholz Jeffrey Pennington 227 348 0 14 Jun 2018