Deformed semicircle law and concentration of nonlinear random matrices for ultra-wide neural networks

20 September 2021
Zhichao Wang, Yizhe Zhu
arXiv:2109.09304

Papers citing "Deformed semicircle law and concentration of nonlinear random matrices for ultra-wide neural networks"

50 of 55 citing papers shown.
Overparameterized random feature regression with nearly orthogonal data
Zhichao Wang, Yizhe Zhu
11 Nov 2022

Asymptotic normality for eigenvalue statistics of a general sample covariance matrix when $p/n \to \infty$ and applications
Jiaxin Qiu, Zeng Li, Jianfeng Yao
14 Sep 2021

Testing Kronecker Product Covariance Matrices for High-dimensional Matrix-Variate Data
Long Yu, Jiahui Xie, Wang Zhou
27 May 2021

Analysis of One-Hidden-Layer Neural Networks via the Resolvent Method
Vanessa Piccolo, Dominik Schröder
11 May 2021

Spiked Singular Values and Vectors under Extreme Aspect Ratios
M. Feldman
30 Apr 2021

Asymptotic Freeness of Layerwise Jacobians Caused by Invariance of Multilayer Perceptron: The Haar Orthogonal Case
B. Collins, Tomohiro Hayase
24 Mar 2021

Deep learning: a statistical viewpoint
Peter L. Bartlett, Andrea Montanari, Alexander Rakhlin
16 Mar 2021

Exact Gap between Generalization Error and Uniform Convergence in Random Feature Models
Zitong Yang, Yu Bai, Song Mei
08 Mar 2021

Learning curves of generic features maps for realistic datasets with a teacher-student model
Bruno Loureiro, Cédric Gerbelot, Hugo Cui, Sebastian Goldt, Florent Krzakala, M. Mézard, Lenka Zdeborová
16 Feb 2021

Generalization error of random features and kernel methods: hypercontractivity and kernel matrix concentration
Song Mei, Theodor Misiakiewicz, Andrea Montanari
26 Jan 2021

On the Proof of Global Convergence of Gradient Descent for Deep ReLU Networks with Linear Widths
Quynh N. Nguyen
24 Jan 2021

Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks
Quynh N. Nguyen, Marco Mondelli, Guido Montúfar
21 Dec 2020

What causes the test error? Going beyond bias-variance via ANOVA
Licong Lin, Yan Sun
11 Oct 2020

Kernel regression in high dimensions: Refined analysis beyond double descent
Fanghui Liu, Zhenyu Liao, Johan A. K. Suykens
06 Oct 2020

The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization
Ben Adlam, Jeffrey Pennington
15 Aug 2020

The Interpolation Phase Transition in Neural Networks: Memorization and Generalization under Lazy Training
Andrea Montanari, Yiqiao Zhong
25 Jul 2020

The Surprising Simplicity of the Early-Time Learning Dynamics of Neural Networks
Wei Hu, Lechao Xiao, Ben Adlam, Jeffrey Pennington
25 Jun 2020

Learning Rates as a Function of Batch Size: A Random Matrix Theory Approach to Neural Network Training
Diego Granziol, S. Zohren, Stephen J. Roberts
Communities: ODL
16 Jun 2020

The Spectrum of Fisher Information of Deep Networks Achieving Dynamical Isometry
Tomohiro Hayase, Ryo Karakida
14 Jun 2020

A Random Matrix Analysis of Random Fourier Features: Beyond the Gaussian Kernel, a Precise Phase Transition, and the Corresponding Double Descent
Zhenyu Liao, Romain Couillet, Michael W. Mahoney
09 Jun 2020

Spectra of the Conjugate Kernel and Neural Tangent Kernel for linear-width neural networks
Z. Fan, Zhichao Wang
25 May 2020

Generalisation error in learning with random features and the hidden manifold model
Federica Gerace, Bruno Loureiro, Florent Krzakala, M. Mézard, Lenka Zdeborová
21 Feb 2020

Implicit Regularization of Random Feature Models
Arthur Jacot, Berfin Simsek, Francesco Spadaro, Clément Hongler, Franck Gabriel
19 Feb 2020

Global Convergence of Deep Networks with One Wide Layer Followed by Pyramidal Topology
Quynh N. Nguyen, Marco Mondelli
Communities: ODL, AI4CE
18 Feb 2020

Stationary Points of Shallow Neural Networks with Quadratic Activation Function
D. Gamarnik, Eren C. Kizildag, Ilias Zadik
03 Dec 2019

A Random Matrix Perspective on Mixtures of Nonlinearities for Deep Learning
Ben Adlam, J. Levinson, Jeffrey Pennington
02 Dec 2019

The generalization error of random features regression: Precise asymptotics and double descent curve
Song Mei, Andrea Montanari
14 Aug 2019

Limitations of Lazy Training of Two-layers Neural Networks
Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari
Communities: MLT
21 Jun 2019

Quadratic Suffices for Over-parametrization via Matrix Chernoff Bound
Zhao Song, Xin Yang
09 Jun 2019

On Exact Computation with an Infinitely Wide Neural Net
Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang
26 Apr 2019

Eigenvalue distribution of nonlinear models of random matrices
L. Benigni, Sandrine Péché
05 Apr 2019

Global Convergence of Adaptive Gradient Methods for An Over-parameterized Neural Network
Xiaoxia Wu, S. Du, Rachel A. Ward
19 Feb 2019

Towards moderate overparameterization: global convergence guarantees for training shallow neural networks
Samet Oymak, Mahdi Soltanolkotabi
12 Feb 2019

Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks
Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruosong Wang
Communities: MLT
24 Jan 2019

On Lazy Training in Differentiable Programming
Lénaïc Chizat, Edouard Oyallon, Francis R. Bach
19 Dec 2018

A Convergence Theory for Deep Learning via Over-Parameterization
Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song
Communities: AI4CE, ODL
09 Nov 2018

Gradient Descent Provably Optimizes Over-parameterized Neural Networks
S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh
Communities: MLT, ODL
04 Oct 2018

Just Interpolate: Kernel "Ridgeless" Regression Can Generalize
Tengyuan Liang, Alexander Rakhlin
01 Aug 2018

Neural Tangent Kernel: Convergence and Generalization in Neural Networks
Arthur Jacot, Franck Gabriel, Clément Hongler
20 Jun 2018

Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
Lechao Xiao, Yasaman Bahri, Jascha Narain Sohl-Dickstein, S. Schoenholz, Jeffrey Pennington
14 Jun 2018

On the Spectrum of Random Features Maps of High Dimensional Data
Zhenyu Liao, Romain Couillet
30 May 2018

Gaussian Process Behaviour in Wide Deep Neural Networks
A. G. Matthews, Mark Rowland, Jiri Hron, Richard Turner, Zoubin Ghahramani
Communities: BDL
30 Apr 2018

Random Fourier Features for Kernel Ridge Regression: Approximation Bounds and Statistical Guarantees
H. Avron, Michael Kapralov, Cameron Musco, Christopher Musco, A. Velingker, A. Zandieh
26 Apr 2018

The Emergence of Spectral Universality in Deep Networks
Jeffrey Pennington, S. Schoenholz, Surya Ganguli
27 Feb 2018

Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice
Jeffrey Pennington, S. Schoenholz, Surya Ganguli
Communities: ODL
13 Nov 2017

Deep Neural Networks as Gaussian Processes
Jaehoon Lee, Yasaman Bahri, Roman Novak, S. Schoenholz, Jeffrey Pennington, Jascha Narain Sohl-Dickstein
Communities: UQCV, BDL
01 Nov 2017

A Random Matrix Approach to Neural Networks
Cosme Louart, Zhenyu Liao, Romain Couillet
17 Feb 2017

Deep Information Propagation
S. Schoenholz, Justin Gilmer, Surya Ganguli, Jascha Narain Sohl-Dickstein
04 Nov 2016

Exponential expressivity in deep neural networks through transient chaos
Ben Poole, Subhaneil Lahiri, M. Raghu, Jascha Narain Sohl-Dickstein, Surya Ganguli
16 Jun 2016

Toward Deeper Understanding of Neural Networks: The Power of Initialization and a Dual View on Expressivity
Amit Daniely, Roy Frostig, Y. Singer
18 Feb 2016