Width Provably Matters in Optimization for Deep Linear Neural Networks
S. Du, Wei Hu
24 January 2019 · arXiv:1901.08572 · v3 (latest)

Papers citing "Width Provably Matters in Optimization for Deep Linear Neural Networks"

18 of 68 citing papers shown:
A Modular Analysis of Provable Acceleration via Polyak's Momentum: Training a Wide ReLU Network and a Deep Linear Network
Jun-Kun Wang, Chi-Heng Lin, Jacob D. Abernethy · 04 Oct 2020

Deep matrix factorizations
Pierre De Handschutter, Nicolas Gillis, Xavier Siebert · 01 Oct 2020 · BDL

Neural Path Features and Neural Path Kernel: Understanding the role of gates in deep learning
Chandrashekar Lakshminarayanan, Amit Singh · 11 Jun 2020 · AI4CE

Analysis of Knowledge Transfer in Kernel Regime
Arman Rahbar, Ashkan Panahi, Chiranjib Bhattacharyya, Devdatt Dubhashi, M. Chehreghani · 30 Mar 2020

On the Global Convergence of Training Deep Linear ResNets
Difan Zou, Philip M. Long, Quanquan Gu · 02 Mar 2020

Revealing the Structure of Deep Neural Networks via Convex Duality
Tolga Ergen, Mert Pilanci · 22 Feb 2020 · MLT

Deep Gated Networks: A framework to understand training and generalisation in deep learning
Chandrashekar Lakshminarayanan, Amit Singh · 10 Feb 2020 · AI4CE

Distribution Approximation and Statistical Estimation Guarantees of Generative Adversarial Networks
Minshuo Chen, Wenjing Liao, H. Zha, Tuo Zhao · 10 Feb 2020

Quasi-Equivalence of Width and Depth of Neural Networks
Fenglei Fan, Rongjie Lai, Ge Wang · 06 Feb 2020

Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks
Wei Hu, Lechao Xiao, Jeffrey Pennington · 16 Jan 2020

Global Convergence of Gradient Descent for Deep Linear Residual Networks
Lei Wu, Qingcan Wang, Chao Ma · 02 Nov 2019 · ODL, AI4CE

Effects of Depth, Width, and Initialization: A Convergence Analysis of Layer-wise Training for Deep Linear Neural Networks
Yeonjong Shin · 14 Oct 2019

Quadratic Suffices for Over-parametrization via Matrix Chernoff Bound
Zhao Song, Xin Yang · 09 Jun 2019

Implicit Regularization in Deep Matrix Factorization
Sanjeev Arora, Nadav Cohen, Wei Hu, Yuping Luo · 31 May 2019 · AI4CE

On Exact Computation with an Infinitely Wide Neural Net
Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang · 26 Apr 2019

Analysis of the Gradient Descent Algorithm for a Deep Neural Network Model with Skip-connections
Weinan E, Chao Ma, Qingcan Wang, Lei Wu · 10 Apr 2019 · MLT

Every Local Minimum Value is the Global Minimum Value of Induced Model in Non-convex Machine Learning
Kenji Kawaguchi, Jiaoyang Huang, L. Kaelbling · 07 Apr 2019 · AAML

Elimination of All Bad Local Minima in Deep Learning
Kenji Kawaguchi, L. Kaelbling · 02 Jan 2019