A Convergence Theory for Deep Learning via Over-Parameterization

9 November 2018

Papers citing "A Convergence Theory for Deep Learning via Over-Parameterization"

50 / 354 papers shown

Title | Authors | Tags | Counts | Date
Are wider nets better given the same number of parameters? | A. Golubeva Behnam Neyshabur Guy Gur-Ari | - | 27 44 0 | 27 Oct 2020
A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks | Zhiqi Bu Shiyun Xu Kan Chen | - | 33 17 0 | 25 Oct 2020
Global optimality of softmax policy gradient with single hidden layer neural networks in the mean-field regime | Andrea Agazzi Jianfeng Lu | - | 13 15 0 | 22 Oct 2020
Deep Learning is Singular, and That's Good | Daniel Murfet Susan Wei Biwei Huang Hui Li Jesse Gell-Redman T. Quella | UQCV | 24 26 0 | 22 Oct 2020
A Unifying View on Implicit Bias in Training Linear Neural Networks | Chulhee Yun Shankar Krishnan H. Mobahi | MLT | 18 80 0 | 06 Oct 2020
On the linearity of large non-linear models: when and why the tangent kernel is constant | Chaoyue Liu Libin Zhu M. Belkin | - | 21 140 0 | 02 Oct 2020
Neural Thompson Sampling | Weitong Zhang Dongruo Zhou Lihong Li Quanquan Gu | - | 34 115 0 | 02 Oct 2020
Deep Equals Shallow for ReLU Networks in Kernel Regimes | A. Bietti Francis R. Bach | - | 30 86 0 | 30 Sep 2020
Tensor Programs III: Neural Matrix Laws | Greg Yang | - | 14 44 0 | 22 Sep 2020
Deep Neural Tangent Kernel and Laplace Kernel Have the Same RKHS | Lin Chen Sheng Xu | - | 32 93 0 | 22 Sep 2020
Generalized Leverage Score Sampling for Neural Networks | J. Lee Ruoqi Shen Zhao Song Mengdi Wang Zheng Yu | - | 21 42 0 | 21 Sep 2020
GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training | Tianle Cai Shengjie Luo Keyulu Xu Di He Tie-Yan Liu Liwei Wang | GNN | 32 159 0 | 07 Sep 2020
Predicting Training Time Without Training | L. Zancato Alessandro Achille Avinash Ravichandran Rahul Bhotika Stefano Soatto | - | 26 24 0 | 28 Aug 2020
Deep Networks and the Multiple Manifold Problem | Sam Buchanan D. Gilboa John N. Wright | - | 166 39 0 | 25 Aug 2020
Multiple Descent: Design Your Own Generalization Curve | Lin Chen Yifei Min M. Belkin Amin Karbasi | DRL | 33 61 0 | 03 Aug 2020
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy | Zuyue Fu Zhuoran Yang Zhaoran Wang | - | 18 42 0 | 02 Aug 2020
The Interpolation Phase Transition in Neural Networks: Memorization and Generalization under Lazy Training | Andrea Montanari Yiqiao Zhong | - | 49 95 0 | 25 Jul 2020
Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy | E. Moroshko Suriya Gunasekar Blake E. Woodworth J. Lee Nathan Srebro Daniel Soudry | - | 35 85 0 | 13 Jul 2020
Weak error analysis for stochastic gradient descent optimization algorithms | A. Bercher Lukas Gonon Arnulf Jentzen Diyora Salimova | - | 26 4 0 | 03 Jul 2020
On the Similarity between the Laplace and Neural Tangent Kernels | Amnon Geifman A. Yadav Yoni Kasten Meirav Galun David Jacobs Ronen Basri | - | 21 89 0 | 03 Jul 2020
Provably Efficient Neural Estimation of Structural Equation Model: An Adversarial Approach | Luofeng Liao You-Lin Chen Zhuoran Yang Bo Dai Zhaoran Wang Mladen Kolar | - | 30 33 0 | 02 Jul 2020
Associative Memory in Iterated Overparameterized Sigmoid Autoencoders | Yibo Jiang Cengiz Pehlevan | - | 19 13 0 | 30 Jun 2020
Tensor Programs II: Neural Tangent Kernel for Any Architecture | Greg Yang | - | 58 135 0 | 25 Jun 2020
Logarithmic Pruning is All You Need | Laurent Orseau Marcus Hutter Omar Rivasplata | - | 28 88 0 | 22 Jun 2020
DO-Conv: Depthwise Over-parameterized Convolutional Layer | Jinming Cao Yangyan Li Mingchao Sun Ying-Cong Chen Dani Lischinski Daniel Cohen-Or Baoquan Chen Changhe Tu | OOD | 33 166 0 | 22 Jun 2020
Training (Overparametrized) Neural Networks in Near-Linear Time | Jan van den Brand Binghui Peng Zhao Song Omri Weinstein | ODL | 29 82 0 | 20 Jun 2020
Exploring Weight Importance and Hessian Bias in Model Pruning | Mingchen Li Yahya Sattar Christos Thrampoulidis Samet Oymak | - | 28 3 0 | 19 Jun 2020
An Online Method for A Class of Distributionally Robust Optimization with Non-Convex Objectives | Qi Qi Zhishuai Guo Yi Tian Xu Rong Jin Tianbao Yang | - | 33 44 0 | 17 Jun 2020
Improving Graph Neural Network Expressivity via Subgraph Isomorphism Counting | Giorgos Bouritsas Fabrizio Frasca S. Zafeiriou M. Bronstein | - | 58 424 0 | 16 Jun 2020
Non-convergence of stochastic gradient descent in the training of deep neural networks | Patrick Cheridito Arnulf Jentzen Florian Rossmannek | - | 14 37 0 | 12 Jun 2020
Directional convergence and alignment in deep learning | Ziwei Ji Matus Telgarsky | - | 20 163 0 | 11 Jun 2020
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory | Yufeng Zhang Qi Cai Zhuoran Yang Yongxin Chen Zhaoran Wang | OOD MLT | 135 11 0 | 08 Jun 2020
Is deeper better? It depends on locality of relevant features | Takashi Mori Masahito Ueda | OOD | 25 4 0 | 26 May 2020
Spectra of the Conjugate Kernel and Neural Tangent Kernel for linear-width neural networks | Z. Fan Zhichao Wang | - | 44 71 0 | 25 May 2020
Feature Purification: How Adversarial Training Performs Robust Deep Learning | Zeyuan Allen-Zhu Yuanzhi Li | MLT AAML | 37 147 0 | 20 May 2020
Orthogonal Over-Parameterized Training | Weiyang Liu Rongmei Lin Zhen Liu James M. Rehg Liam Paull Li Xiong Le Song Adrian Weller | - | 32 41 0 | 09 Apr 2020
A Mean-field Analysis of Deep ResNet and Beyond: Towards Provable Optimization Via Overparameterization From Depth | Yiping Lu Chao Ma Yulong Lu Jianfeng Lu Lexing Ying | MLT | 39 78 0 | 11 Mar 2020
Frequency Bias in Neural Networks for Input of Non-Uniform Density | Ronen Basri Meirav Galun Amnon Geifman David Jacobs Yoni Kasten S. Kritchman | - | 39 183 0 | 10 Mar 2020
The large learning rate phase of deep learning: the catapult mechanism | Aitor Lewkowycz Yasaman Bahri Ethan Dyer Jascha Narain Sohl-Dickstein Guy Gur-Ari | ODL | 159 235 0 | 04 Mar 2020
Loss landscapes and optimization in over-parameterized non-linear systems and neural networks | Chaoyue Liu Libin Zhu M. Belkin | ODL | 17 248 0 | 29 Feb 2020
Neural Networks are Convex Regularizers: Exact Polynomial-time Convex Optimization Formulations for Two-layer Networks | Mert Pilanci Tolga Ergen | - | 26 116 0 | 24 Feb 2020
Generalisation error in learning with random features and the hidden manifold model | Federica Gerace Bruno Loureiro Florent Krzakala M. Mézard Lenka Zdeborová | - | 25 166 0 | 21 Feb 2020
Learning Parities with Neural Networks | Amit Daniely Eran Malach | - | 24 76 0 | 18 Feb 2020
Convergence of End-to-End Training in Deep Unsupervised Contrastive Learning | Zixin Wen | SSL | 21 2 0 | 17 Feb 2020
Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality | Yi Zhang Orestis Plevrakis S. Du Xingguo Li Zhao Song Sanjeev Arora | - | 29 51 0 | 16 Feb 2020
Distribution Approximation and Statistical Estimation Guarantees of Generative Adversarial Networks | Minshuo Chen Wenjing Liao H. Zha Tuo Zhao | - | 26 15 0 | 10 Feb 2020
Proving the Lottery Ticket Hypothesis: Pruning is All You Need | Eran Malach Gilad Yehudai Shai Shalev-Shwartz Ohad Shamir | - | 64 271 0 | 03 Feb 2020
Memory capacity of neural networks with threshold and ReLU activations | Roman Vershynin | - | 31 21 0 | 20 Jan 2020
Distributionally Robust Deep Learning using Hardness Weighted Sampling | Lucas Fidon Michael Aertsen Thomas Deprest Doaa Emam Frédéric Guffens ... Andrew Melbourne Sébastien Ourselin Jan Deprest Georg Langs Tom Kamiel Magda Vercauteren | OOD | 22 10 0 | 08 Jan 2020
Revisiting Landscape Analysis in Deep Neural Networks: Eliminating Decreasing Paths to Infinity | Shiyu Liang Ruoyu Sun R. Srikant | - | 35 19 0 | 31 Dec 2019