ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Gradient Descent Provably Optimizes Over-parameterized Neural Networks

4 October 2018
S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh
arXiv: 1810.02054 (v2, latest)
Tags: MLT, ODL

Papers citing "Gradient Descent Provably Optimizes Over-parameterized Neural Networks"

50 / 882 papers shown
• Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? -- A Neural Tangent Kernel Perspective
  Kaixuan Huang, Yuqing Wang, Molei Tao, T. Zhao · MLT · 14 Feb 2020
• Stochasticity of Deterministic Gradient Descent: Large Learning Rate for Multiscale Objective Function
  Lingkai Kong, Molei Tao · 14 Feb 2020
• Estimating Uncertainty Intervals from Collaborating Networks
  Tianhui Zhou, Yitong Li, Yuan Wu, David Carlson · UQCV · 12 Feb 2020
• Training Two-Layer ReLU Networks with Gradient Descent is Inconsistent
  David Holzmüller, Ingo Steinwart · MLT · 12 Feb 2020
• Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss
  Lénaïc Chizat, Francis R. Bach · MLT · 11 Feb 2020
• A Generalized Neural Tangent Kernel Analysis for Two-layer Neural Networks
  Zixiang Chen, Yuan Cao, Quanquan Gu, Tong Zhang · MLT · 10 Feb 2020
• Taylorized Training: Towards Better Approximation of Neural Network Training at Finite Width
  Yu Bai, Ben Krause, Huan Wang, Caiming Xiong, R. Socher · 10 Feb 2020
• Quasi-Equivalence of Width and Depth of Neural Networks
  Fenglei Fan, Rongjie Lai, Ge Wang · 06 Feb 2020
• A Deep Conditioning Treatment of Neural Networks
  Naman Agarwal, Pranjal Awasthi, Satyen Kale · AI4CE · 04 Feb 2020
• Learning from Noisy Similar and Dissimilar Data
  Soham Dan, Han Bao, Masashi Sugiyama · NoLa · 03 Feb 2020
• Proving the Lottery Ticket Hypothesis: Pruning is All You Need
  Eran Malach, Gilad Yehudai, Shai Shalev-Shwartz, Ohad Shamir · 03 Feb 2020
• Memory capacity of neural networks with threshold and ReLU activations
  Roman Vershynin · 20 Jan 2020
• On Iterative Neural Network Pruning, Reinitialization, and the Similarity of Masks
  Michela Paganini, Jessica Zosa Forde · 14 Jan 2020
• Disentangling Trainability and Generalization in Deep Neural Networks
  Lechao Xiao, Jeffrey Pennington, S. Schoenholz · 30 Dec 2019
• On the Principle of Least Symmetry Breaking in Shallow ReLU Models
  Yossi Arjevani, M. Field · 26 Dec 2019
• Landscape Connectivity and Dropout Stability of SGD Solutions for Over-parameterized Neural Networks
  Aleksandr Shevchenko, Marco Mondelli · 20 Dec 2019
• Second-order Information in First-order Optimization Methods
  Yuzheng Hu, Licong Lin, Shange Tang · ODL · 20 Dec 2019
• On the Bias-Variance Tradeoff: Textbooks Need an Update
  Brady Neal · 17 Dec 2019
• A Finite-Time Analysis of Q-Learning with Neural Network Function Approximation
  Pan Xu, Quanquan Gu · 10 Dec 2019
• Neural Tangents: Fast and Easy Infinite Neural Networks in Python
  Roman Novak, Lechao Xiao, Jiri Hron, Jaehoon Lee, Alexander A. Alemi, Jascha Narain Sohl-Dickstein, S. Schoenholz · 05 Dec 2019
• Stationary Points of Shallow Neural Networks with Quadratic Activation Function
  D. Gamarnik, Eren C. Kizildag, Ilias Zadik · 03 Dec 2019
• Towards Understanding the Spectral Bias of Deep Learning
  Yuan Cao, Zhiying Fang, Yue Wu, Ding-Xuan Zhou, Quanquan Gu · 03 Dec 2019
• How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks?
  Zixiang Chen, Yuan Cao, Difan Zou, Quanquan Gu · 27 Nov 2019
• Benefits of Jointly Training Autoencoders: An Improved Neural Tangent Kernel Analysis
  Thanh Van Nguyen, Raymond K. W. Wong, Chinmay Hegde · 27 Nov 2019
• Implicit Regularization and Convergence for Weight Normalization
  Xiaoxia Wu, Yan Sun, Zhaolin Ren, Shanshan Wu, Zhiyuan Li, Suriya Gunasekar, Rachel A. Ward, Qiang Liu · 18 Nov 2019
• Convex Formulation of Overparameterized Deep Neural Networks
  Cong Fang, Yihong Gu, Weizhong Zhang, Tong Zhang · 18 Nov 2019
• Asymptotics of Reinforcement Learning with Neural Networks
  Justin A. Sirignano, K. Spiliopoulos · MLT · 13 Nov 2019
• Quadratic number of nodes is sufficient to learn a dataset via gradient descent
  Biswarup Das, Eugene Golikov · MLT · 13 Nov 2019
• Tight Sample Complexity of Learning One-hidden-layer Convolutional Neural Networks
  Yuan Cao, Quanquan Gu · MLT · 12 Nov 2019
• Neural Contextual Bandits with UCB-based Exploration
  Dongruo Zhou, Lihong Li, Quanquan Gu · 11 Nov 2019
• Stronger Convergence Results for Deep Residual Networks: Network Width Scales Linearly with Training Data Size
  Talha Cihad Gulcu · 11 Nov 2019
• Enhanced Convolutional Neural Tangent Kernels
  Zhiyuan Li, Ruosong Wang, Dingli Yu, S. Du, Wei Hu, Ruslan Salakhutdinov, Sanjeev Arora · 03 Nov 2019
• Global Convergence of Gradient Descent for Deep Linear Residual Networks
  Lei Wu, Qingcan Wang, Chao Ma · ODL, AI4CE · 02 Nov 2019
• Denoising and Regularization via Exploiting the Structural Bias of Convolutional Generators
  Reinhard Heckel, Mahdi Soltanolkotabi · DiffM · 31 Oct 2019
• Learning Boolean Circuits with Neural Networks
  Eran Malach, Shai Shalev-Shwartz · 25 Oct 2019
• Over Parameterized Two-level Neural Networks Can Learn Near Optimal Feature Representations
  Cong Fang, Hanze Dong, Tong Zhang · 25 Oct 2019
• Capacity, Bandwidth, and Compositionality in Emergent Language Learning
  Cinjon Resnick, Abhinav Gupta, Jakob N. Foerster, Andrew M. Dai, Kyunghyun Cho · 24 Oct 2019
• Image recognition from raw labels collected without annotators
  Fatih Yilmaz, Reinhard Heckel · NoLa · 20 Oct 2019
• Self-Adaptive Network Pruning
  Jinting Chen, Zhaocheng Zhu, Chengwei Li, Yuming Zhao · 3DPC · 20 Oct 2019
• Neural tangent kernels, transportation mappings, and universal approximation
  Ziwei Ji, Matus Telgarsky, Ruicheng Xian · 15 Oct 2019
• The Local Elasticity of Neural Networks
  Hangfeng He, Weijie J. Su · 15 Oct 2019
• Effects of Depth, Width, and Initialization: A Convergence Analysis of Layer-wise Training for Deep Linear Neural Networks
  Yeonjong Shin · 14 Oct 2019
• Nearly Minimal Over-Parametrization of Shallow Neural Networks
  Armin Eftekhari, Chaehwan Song, Volkan Cevher · 09 Oct 2019
• Algorithm-Dependent Generalization Bounds for Overparameterized Deep Residual Networks
  Spencer Frei, Yuan Cao, Quanquan Gu · ODL · 07 Oct 2019
• Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks
  Sanjeev Arora, S. Du, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang, Dingli Yu · AAML · 03 Oct 2019
• Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks
  Yu Bai, Jason D. Lee · 03 Oct 2019
• Distillation ≈ Early Stopping? Harvesting Dark Knowledge Utilizing Anisotropic Information Retrieval For Overparameterized Neural Network
  Bin Dong, Jikai Hou, Yiping Lu, Zhihua Zhang · 02 Oct 2019
• The asymptotic spectrum of the Hessian of DNN throughout training
  Arthur Jacot, Franck Gabriel, Clément Hongler · 01 Oct 2019
• On the convergence of gradient descent for two layer neural networks
  Lei Li · MLT · 30 Sep 2019
• On the Anomalous Generalization of GANs
  Jinchen Xuan, Yunchang Yang, Ze Yang, Di He, Liwei Wang · 27 Sep 2019