ResearchTrend.AI
Gradient Descent Provably Optimizes Over-parameterized Neural Networks
arXiv:1810.02054 · 4 October 2018
S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh
MLT, ODL

Papers citing "Gradient Descent Provably Optimizes Over-parameterized Neural Networks"

50 / 882 papers shown
Exact Convergence Rates of the Neural Tangent Kernel in the Large Depth Limit
  Soufiane Hayou, Arnaud Doucet, Judith Rousseau · 31 May 2019 · 106 / 4 / 0
What Can Neural Networks Reason About?
  Keyulu Xu, Jingling Li, Mozhi Zhang, S. Du, Ken-ichi Kawarabayashi, Stefanie Jegelka · NAI, AI4CE · 30 May 2019 · 110 / 248 / 0
Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks
  Yuan Cao, Quanquan Gu · MLT, AI4CE · 30 May 2019 · 131 / 392 / 0
Graph Neural Tangent Kernel: Fusing Graph Neural Networks with Graph Kernels
  S. Du, Kangcheng Hou, Barnabás Póczós, Ruslan Salakhutdinov, Ruosong Wang, Keyulu Xu · 30 May 2019 · 142 / 276 / 0
Generalization bounds for deep convolutional neural networks
  Philip M. Long, Hanie Sedghi · MLT · 29 May 2019 · 136 / 90 / 0
Norm-based generalisation bounds for multi-class convolutional neural networks
  Antoine Ledent, Waleed Mustafa, Yunwen Lei, Marius Kloft · 29 May 2019 · 66 / 5 / 0
On the Inductive Bias of Neural Tangent Kernels
  A. Bietti, Julien Mairal · 29 May 2019 · 128 / 260 / 0
Gram-Gauss-Newton Method: Learning Overparameterized Neural Networks for Regression Problems
  Tianle Cai, Ruiqi Gao, Jikai Hou, Siyu Chen, Dong Wang, Di He, Zhihua Zhang, Liwei Wang · ODL · 28 May 2019 · 76 / 57 / 0
Simple and Effective Regularization Methods for Training on Noisily Labeled Data with Generalization Guarantee
  Wei Hu, Zhiyuan Li, Dingli Yu · NoLa · 27 May 2019 · 113 / 12 / 0
Fast Convergence of Natural Gradient Descent for Overparameterized Neural Networks
  Guodong Zhang, James Martens, Roger C. Grosse · ODL · 27 May 2019 · 113 / 126 / 0
Temporal-difference learning with nonlinear function approximation: lazy training and mean field regimes
  Andrea Agazzi, Jianfeng Lu · 27 May 2019 · 98 / 8 / 0
On Learning Over-parameterized Neural Networks: A Functional Approximation Perspective
  Lili Su, Pengkun Yang · MLT · 26 May 2019 · 80 / 54 / 0
Enhancing Adversarial Defense by k-Winners-Take-All
  Chang Xiao, Peilin Zhong, Changxi Zheng · AAML · 25 May 2019 · 80 / 99 / 0
What Can ResNet Learn Efficiently, Going Beyond Kernels?
  Zeyuan Allen-Zhu, Yuanzhi Li · 24 May 2019 · 418 / 183 / 0
On the Learning Dynamics of Two-layer Nonlinear Convolutional Neural Networks
  Ting Yu, Junzhao Zhang, Zhanxing Zhu · MLT · 24 May 2019 · 44 / 5 / 0
Gradient Descent can Learn Less Over-parameterized Two-layer Neural Networks on Classification Problems
  Atsushi Nitanda, Geoffrey Chinot, Taiji Suzuki · MLT · 23 May 2019 · 105 / 34 / 0
A type of generalization error induced by initialization in deep neural networks
  Yaoyu Zhang, Zhi-Qin John Xu, Zheng Ma · 19 May 2019 · 128 / 51 / 0
An Essay on Optimization Mystery of Deep Learning
  Eugene Golikov · ODL · 17 May 2019 · 30 / 0 / 0
Data-dependent Sample Complexity of Deep Neural Networks via Lipschitz Augmentation
  Colin Wei, Tengyu Ma · 09 May 2019 · 87 / 110 / 0
Rethinking Arithmetic for Deep Neural Networks
  George A. Constantinides · 07 May 2019 · 64 / 4 / 0
Linearized two-layers neural networks in high dimension
  Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari · MLT · 27 Apr 2019 · 97 / 243 / 0
On Exact Computation with an Infinitely Wide Neural Net
  Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang · 26 Apr 2019 · 294 / 928 / 0
The Impact of Neural Network Overparameterization on Gradient Confusion and Stochastic Gradient Descent
  Karthik A. Sankararaman, Soham De, Zheng Xu, Wenjie Huang, Tom Goldstein · ODL · 15 Apr 2019 · 120 / 106 / 0
Analysis of the Gradient Descent Algorithm for a Deep Neural Network Model with Skip-connections
  E. Weinan, Chao Ma, Qingcan Wang, Lei Wu · MLT · 10 Apr 2019 · 108 / 22 / 0
A Comparative Analysis of the Optimization and Generalization Property of Two-layer Neural Network and Random Feature Models Under Gradient Descent Dynamics
  E. Weinan, Chao Ma, Lei Wu · MLT · 08 Apr 2019 · 85 / 124 / 0
Correlation Congruence for Knowledge Distillation
  Baoyun Peng, Xiao Jin, Jiaheng Liu, Shunfeng Zhou, Yichao Wu, Yu Liu, Dongsheng Li, Zhaoning Zhang · 03 Apr 2019 · 100 / 515 / 0
Convergence rates for the stochastic gradient descent method for non-convex objective functions
  Benjamin J. Fehrman, Benjamin Gess, Arnulf Jentzen · 02 Apr 2019 · 98 / 101 / 0
On the Power and Limitations of Random Features for Understanding Neural Networks
  Gilad Yehudai, Ohad Shamir · MLT · 01 Apr 2019 · 125 / 182 / 0
On the Stability and Generalization of Learning with Kernel Activation Functions
  M. Cirillo, Simone Scardapane, S. Van Vaerenbergh, A. Uncini · 28 Mar 2019 · 20 / 0 / 0
Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks
  Mingchen Li, Mahdi Soltanolkotabi, Samet Oymak · NoLa · 27 Mar 2019 · 140 / 356 / 0
Surprises in High-Dimensional Ridgeless Least Squares Interpolation
  Trevor Hastie, Andrea Montanari, Saharon Rosset, Robert Tibshirani · 19 Mar 2019 · 302 / 747 / 0
Stabilize Deep ResNet with A Sharp Scaling Factor τ
  Huishuai Zhang, Da Yu, Mingyang Yi, Wei Chen, Tie-Yan Liu · 17 Mar 2019 · 57 / 9 / 0
Theory III: Dynamics and Generalization in Deep Networks
  Andrzej Banburski, Q. Liao, Alycia Lee, Lorenzo Rosasco, Fernanda De La Torre, Jack Hidary, T. Poggio · AI4CE · 12 Mar 2019 · 74 / 3 / 0
Mean Field Analysis of Deep Neural Networks
  Justin A. Sirignano, K. Spiliopoulos · 11 Mar 2019 · 109 / 82 / 0
A Priori Estimates of the Population Risk for Residual Networks
  E. Weinan, Chao Ma, Qingcan Wang · UQCV · 06 Mar 2019 · 103 / 61 / 0
Why Learning of Large-Scale Neural Networks Behaves Like Convex Optimization
  Hui Jiang · 06 Mar 2019 · 28 / 8 / 0
Implicit Regularization in Over-parameterized Neural Networks
  M. Kubo, Ryotaro Banno, Hidetaka Manabe, Masataka Minoji · 05 Mar 2019 · 88 / 23 / 0
Stabilizing the Lottery Ticket Hypothesis
  Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin · 05 Mar 2019 · 88 / 103 / 0
LipschitzLR: Using theoretically computed adaptive learning rates for fast convergence
  Rahul Yedida, Snehanshu Saha, Tejas Prashanth · ODL · 20 Feb 2019 · 53 / 12 / 0
Global Convergence of Adaptive Gradient Methods for An Over-parameterized Neural Network
  Xiaoxia Wu, S. Du, Rachel A. Ward · 19 Feb 2019 · 103 / 66 / 0
Mean-field theory of two-layers neural networks: dimension-free bounds and kernel limit
  Song Mei, Theodor Misiakiewicz, Andrea Montanari · MLT · 16 Feb 2019 · 90 / 280 / 0
Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation
  Greg Yang · 13 Feb 2019 · 209 / 289 / 0
Identity Crisis: Memorization and Generalization under Extreme Overparameterization
  Chiyuan Zhang, Samy Bengio, Moritz Hardt, Michael C. Mozer, Y. Singer · 13 Feb 2019 · 60 / 90 / 0
Towards moderate overparameterization: global convergence guarantees for training shallow neural networks
  Samet Oymak, Mahdi Soltanolkotabi · 12 Feb 2019 · 79 / 323 / 0
Understanding over-parameterized deep networks by geometrization
  Xiao Dong, Ling Zhou · GNN, AI4CE · 11 Feb 2019 · 45 / 7 / 0
Mean Field Limit of the Learning Dynamics of Multilayer Neural Networks
  Phan-Minh Nguyen · AI4CE · 07 Feb 2019 · 82 / 72 / 0
Are All Layers Created Equal?
  Chiyuan Zhang, Samy Bengio, Y. Singer · 06 Feb 2019 · 111 / 140 / 0
Generalization Error Bounds of Gradient Descent for Learning Over-parameterized Deep ReLU Networks
  Yuan Cao, Quanquan Gu · ODL, MLT, AI4CE · 04 Feb 2019 · 156 / 158 / 0
Stiffness: A New Perspective on Generalization in Neural Networks
  Stanislav Fort, Pawel Krzysztof Nowak, Stanislaw Jastrzebski, S. Narayanan · 28 Jan 2019 · 152 / 94 / 0
Dynamical Isometry and a Mean Field Theory of LSTMs and GRUs
  D. Gilboa, B. Chang, Minmin Chen, Greg Yang, S. Schoenholz, Ed H. Chi, Jeffrey Pennington · 25 Jan 2019 · 86 / 42 / 0