ResearchTrend.AI
Gradient Descent Provably Optimizes Over-parameterized Neural Networks
arXiv: 1810.02054 (v2, latest) · 4 October 2018
Authors: S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh
Topics: MLT, ODL
Links: arXiv (abs) · PDF · HTML

Papers citing "Gradient Descent Provably Optimizes Over-parameterized Neural Networks"

50 / 882 papers shown, newest first:

  • Weighted Neural Tangent Kernel: A Generalized and Improved Network-Induced Kernel. Lei Tan, Shutong Wu, Xiaolin Huang. 22 Mar 2021.
  • The Discovery of Dynamics via Linear Multistep Methods and Deep Learning: Error Estimation. Q. Du, Yiqi Gu, Haizhao Yang, Chao Zhou. 21 Mar 2021.
  • The Low-Rank Simplicity Bias in Deep Networks. Minyoung Huh, H. Mobahi, Richard Y. Zhang, Brian Cheung, Pulkit Agrawal, Phillip Isola. 18 Mar 2021.
  • Sample Complexity of Offline Reinforcement Learning with Deep ReLU Networks. Thanh Nguyen-Tang, Sunil R. Gupta, Hung The Tran, Svetha Venkatesh. Topics: OffRL. 11 Mar 2021.
  • On the Generalization Power of Overfitted Two-Layer Neural Tangent Kernel Models. Peizhong Ju, Xiaojun Lin, Ness B. Shroff. Topics: MLT. 09 Mar 2021.
  • Asymptotics of Ridge Regression in Convolutional Models. Mojtaba Sahraee-Ardakan, Tung Mai, Anup B. Rao, Ryan Rossi, S. Rangan, A. Fletcher. Topics: MLT. 08 Mar 2021.
  • Unintended Effects on Adaptive Learning Rate for Training Neural Network with Output Scale Change. Ryuichi Kanoh, M. Sugiyama. 05 Mar 2021.
  • Generalization Bounds for Sparse Random Feature Expansions. Abolfazl Hashemi, Hayden Schaeffer, Robert Shi, Ufuk Topcu, Giang Tran, Rachel A. Ward. Topics: MLT. 04 Mar 2021.
  • Lost in Pruning: The Effects of Pruning Neural Networks beyond Test Accuracy. Lucas Liebenwein, Cenk Baykal, Brandon Carter, David K Gifford, Daniela Rus. Topics: AAML. 04 Mar 2021.
  • Shift Invariance Can Reduce Adversarial Robustness. Songwei Ge, Vasu Singla, Ronen Basri, David Jacobs. Topics: AAML, OOD. 03 Mar 2021.
  • Self-Regularity of Non-Negative Output Weights for Overparameterized Two-Layer Neural Networks. D. Gamarnik, Eren C. Kizildaug, Ilias Zadik. 02 Mar 2021.
  • Sample Complexity and Overparameterization Bounds for Temporal Difference Learning with Neural Network Approximation. Semih Cayci, Siddhartha Satpathi, Niao He, R. Srikant. 02 Mar 2021.
  • Quantifying the Benefit of Using Differentiable Learning over Tangent Kernels. Eran Malach, Pritish Kamath, Emmanuel Abbe, Nathan Srebro. 01 Mar 2021.
  • Experiments with Rich Regime Training for Deep Learning. Xinyan Li, A. Banerjee. 26 Feb 2021.
  • Learning with invariances in random features and kernel models. Song Mei, Theodor Misiakiewicz, Andrea Montanari. Topics: OOD. 25 Feb 2021.
  • Batched Neural Bandits. Quanquan Gu, Amin Karbasi, Khashayar Khosravi, Vahab Mirrokni, Dongruo Zhou. Topics: BDL, OffRL. 25 Feb 2021.
  • On the Validity of Modeling SGD with Stochastic Differential Equations (SDEs). Zhiyuan Li, Sadhika Malladi, Sanjeev Arora. 24 Feb 2021.
  • Convergence rates for gradient descent in the training of overparameterized artificial neural networks with biases. Arnulf Jentzen, T. Kröger. Topics: ODL. 23 Feb 2021.
  • Classifying high-dimensional Gaussian mixtures: Where kernel methods fail and neural networks succeed. Maria Refinetti, Sebastian Goldt, Florent Krzakala, Lenka Zdeborová. 23 Feb 2021.
  • GIST: Distributed Training for Large-Scale Graph Convolutional Networks. Cameron R. Wolfe, Jingkang Yang, Arindam Chowdhury, Chen Dun, Artun Bayer, Santiago Segarra, Anastasios Kyrillidis. Topics: BDL, GNN, LRM. 20 Feb 2021.
  • A proof of convergence for gradient descent in the training of artificial neural networks for constant target functions. Patrick Cheridito, Arnulf Jentzen, Adrian Riekert, Florian Rossmannek. 19 Feb 2021.
  • On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent. Shahar Azulay, E. Moroshko, Mor Shpigel Nacson, Blake E. Woodworth, Nathan Srebro, Amir Globerson, Daniel Soudry. Topics: AI4CE. 19 Feb 2021.
  • A Mathematical Principle of Deep Learning: Learn the Geodesic Curve in the Wasserstein Space. Kuo Gai, Shihua Zhang. 18 Feb 2021.
  • FedBN: Federated Learning on Non-IID Features via Local Batch Normalization. Xiaoxiao Li, Meirui Jiang, Xiaofei Zhang, Michael Kamp, Qi Dou. Topics: OOD, FedML. 15 Feb 2021.
  • WGAN with an Infinitely Wide Generator Has No Spurious Stationary Points. Albert No, Taeho Yoon, Sehyun Kwon, Ernest K. Ryu. Topics: GAN. 15 Feb 2021.
  • Weight Rescaling: Effective and Robust Regularization for Deep Neural Networks with Batch Normalization. Ziquan Liu, Yufei Cui, Jia Wan, Yushun Mao, Antoni B. Chan. 06 Feb 2021.
  • A Local Convergence Theory for Mildly Over-Parameterized Two-Layer Neural Network. Mo Zhou, Rong Ge, Chi Jin. 04 Feb 2021.
  • Information-Theoretic Generalization Bounds for Stochastic Gradient Descent. Gergely Neu, Gintare Karolina Dziugaite, Mahdi Haghifam, Daniel M. Roy. 01 Feb 2021.
  • Neural Networks with Complex-Valued Weights Have No Spurious Local Minima. Xingtu Liu. Topics: MLT. 31 Jan 2021.
  • Generalization error of random features and kernel methods: hypercontractivity and kernel matrix concentration. Song Mei, Theodor Misiakiewicz, Andrea Montanari. 26 Jan 2021.
  • On the Proof of Global Convergence of Gradient Descent for Deep ReLU Networks with Linear Widths. Quynh N. Nguyen. 24 Jan 2021.
  • Implicit Bias of Linear RNNs. M. Motavali Emami, Mojtaba Sahraee-Ardakan, Parthe Pandit, S. Rangan, A. Fletcher. 19 Jan 2021.
  • Learning with Gradient Descent and Weakly Convex Losses. Dominic Richards, Michael G. Rabbat. Topics: MLT. 13 Jan 2021.
  • Reproducing Activation Function for Deep Learning. Senwei Liang, Liyao Lyu, Chunmei Wang, Haizhao Yang. 13 Jan 2021.
  • A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks. Asaf Noy, Yi Tian Xu, Y. Aflalo, Lihi Zelnik-Manor, Rong Jin. 12 Jan 2021.
  • Towards Understanding Learning in Neural Networks with Linear Teachers. Roei Sarussi, Alon Brutzkus, Amir Globerson. Topics: FedML, MLT. 07 Jan 2021.
  • Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise. Spencer Frei, Yuan Cao, Quanquan Gu. Topics: FedML, MLT. 04 Jan 2021.
  • Particle Dual Averaging: Optimization of Mean Field Neural Networks with Global Convergence Rate Analysis. Atsushi Nitanda, Denny Wu, Taiji Suzuki. 31 Dec 2020.
  • Perspective: A Phase Diagram for Deep Learning unifying Jamming, Feature Learning and Lazy Training. Mario Geiger, Leonardo Petrini, Matthieu Wyart. Topics: DRL. 30 Dec 2020.
  • Mathematical Models of Overparameterized Neural Networks. Cong Fang, Hanze Dong, Tong Zhang. 27 Dec 2020.
  • Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks. Quynh N. Nguyen, Marco Mondelli, Guido Montúfar. 21 Dec 2020.
  • Recent advances in deep learning theory. Fengxiang He, Dacheng Tao. Topics: AI4CE. 20 Dec 2020.
  • Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning. Zeyuan Allen-Zhu, Yuanzhi Li. Topics: FedML. 17 Dec 2020.
  • Strong overall error analysis for the training of artificial neural networks via random initializations. Arnulf Jentzen, Adrian Riekert. 15 Dec 2020.
  • Approximation of BV functions by neural networks: A regularity theory approach. B. Avelin, Vesa Julin. 15 Dec 2020.
  • Notes on Deep Learning Theory. Eugene Golikov. Topics: VLM, AI4CE. 10 Dec 2020.
  • On the emergence of simplex symmetry in the final and penultimate layers of neural network classifiers. E. Weinan, Stephan Wojtowytsch. 10 Dec 2020.
  • Neural Mechanics: Symmetry and Broken Conservation Laws in Deep Learning Dynamics. D. Kunin, Javier Sagastuy-Breña, Surya Ganguli, Daniel L. K. Yamins, Hidenori Tanaka. 08 Dec 2020.
  • Benefit of deep learning with non-convex noisy gradient descent: Provable excess risk bound and superiority to kernel methods. Taiji Suzuki, Shunta Akiyama. Topics: MLT. 06 Dec 2020.
  • Effect of the initial configuration of weights on the training and function of artificial neural networks. Ricardo J. Jesus, Mário Antunes, R. A. D. Costa, S. Dorogovtsev, J. F. F. Mendes, R. Aguiar. 04 Dec 2020.