Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient
Descent

Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent

18 February 2019

Jascha Narain Sohl-Dickstein

Jeffrey Pennington

Papers citing "Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent"

11 / 261 papers shown

Title
Gram-Gauss-Newton Method: Learning Overparameterized Neural Networks for Regression Problems Tianle Cai Ruiqi Gao Jikai Hou Siyu Chen Dong Wang Di He Zhihua Zhang Liwei Wang ODL 21 57 0 28 May 2019
What Can ResNet Learn Efficiently, Going Beyond Kernels? Zeyuan Allen-Zhu Yuanzhi Li 24 183 0 24 May 2019
A type of generalization error induced by initialization in deep neural networks Yaoyu Zhang Zhi-Qin John Xu Tao Luo Zheng Ma 9 49 0 19 May 2019
Data-dependent Sample Complexity of Deep Neural Networks via Lipschitz Augmentation Colin Wei Tengyu Ma 25 109 0 09 May 2019
Linearized two-layers neural networks in high dimension Behrooz Ghorbani Song Mei Theodor Misiakiewicz Andrea Montanari MLT 18 241 0 27 Apr 2019
On Exact Computation with an Infinitely Wide Neural Net Sanjeev Arora S. Du Wei Hu Zhiyuan Li Ruslan Salakhutdinov Ruosong Wang 44 901 0 26 Apr 2019
An Empirical Study of Large-Batch Stochastic Gradient Descent with Structured Covariance Noise Yeming Wen Kevin Luk Maxime Gazeau Guodong Zhang Harris Chan Jimmy Ba ODL 20 22 0 21 Feb 2019
Scaling description of generalization with number of parameters in deep learning Mario Geiger Arthur Jacot S. Spigler Franck Gabriel Levent Sagun Stéphane dÁscoli Giulio Biroli Clément Hongler M. Wyart 52 195 0 06 Jan 2019
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks Lechao Xiao Yasaman Bahri Jascha Narain Sohl-Dickstein S. Schoenholz Jeffrey Pennington 233 348 0 14 Jun 2018
High-dimensional dynamics of generalization error in neural networks Madhu S. Advani Andrew M. Saxe AI4CE 78 464 0 10 Oct 2017
A Differential Equation for Modeling Nesterov's Accelerated Gradient Method: Theory and Insights Weijie Su Stephen P. Boyd Emmanuel J. Candes 108 1,154 0 04 Mar 2015