Gradient Descent Maximizes the Margin of Homogeneous Neural Networks
Kaifeng Lyu, Jian Li
arXiv:1906.05890, 13 June 2019
Papers citing "Gradient Descent Maximizes the Margin of Homogeneous Neural Networks" (showing 46 of 246 papers):
- Provable Generalization of SGD-trained Neural Networks of Any Width in the Presence of Adversarial Label Noise. Spencer Frei, Yuan Cao, Quanquan Gu. 04 Jan 2021. [FedML, MLT]
- Explicit regularization and implicit bias in deep network classifiers trained with the square loss. T. Poggio, Q. Liao. 31 Dec 2020.
- Towards Resolving the Implicit Bias of Gradient Descent for Matrix Factorization: Greedy Low-Rank Learning. Zhiyuan Li, Yuping Luo, Kaifeng Lyu. 17 Dec 2020.
- The Implicit Bias for Adaptive Optimization Algorithms on Homogeneous Neural Networks. Bohan Wang, Qi Meng, Wei Chen, Tie-Yan Liu. 11 Dec 2020.
- Implicit Regularization in ReLU Networks with the Square Loss. Gal Vardi, Ohad Shamir. 09 Dec 2020.
- Implicit bias of deep linear networks in the large learning rate phase. Wei Huang, Weitao Du, R. Xu, Chunrui Liu. 25 Nov 2020.
- Implicit bias of any algorithm: bounding bias via margin. Elvis Dohmatob. 12 Nov 2020.
- Inductive Bias of Gradient Descent for Weight Normalized Smooth Homogeneous Neural Nets. Depen Morwani, H. G. Ramaswamy. 24 Oct 2020.
- Train simultaneously, generalize better: Stability of gradient-based minimax learners. Farzan Farnia, Asuman Ozdaglar. 23 Oct 2020.
- Precise Statistical Analysis of Classification Accuracies for Adversarial Training. Adel Javanmard, Mahdi Soltanolkotabi. 21 Oct 2020. [AAML]
- Effects of Parameter Norm Growth During Transformer Training: Inductive Bias from Gradient Descent. William Merrill, Vivek Ramanujan, Yoav Goldberg, Roy Schwartz, Noah A. Smith. 19 Oct 2020. [AI4CE]
- AdaBelief Optimizer: Adapting Stepsizes by the Belief in Observed Gradients. Juntang Zhuang, Tommy M. Tang, Yifan Ding, S. Tatikonda, Nicha Dvornek, X. Papademetris, James S. Duncan. 15 Oct 2020. [ODL]
- A Unifying View on Implicit Bias in Training Linear Neural Networks. Chulhee Yun, Shankar Krishnan, H. Mobahi. 06 Oct 2020. [MLT]
- A Modular Analysis of Provable Acceleration via Polyak's Momentum: Training a Wide ReLU Network and a Deep Linear Network. Jun-Kun Wang, Chi-Heng Lin, Jacob D. Abernethy. 04 Oct 2020.
- Understanding Implicit Regularization in Over-Parameterized Single Index Model. Jianqing Fan, Zhuoran Yang, Mengxin Yu. 16 Jul 2020.
- Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy. E. Moroshko, Suriya Gunasekar, Blake E. Woodworth, J. Lee, Nathan Srebro, Daniel Soudry. 13 Jul 2020.
- Regularization Matters: A Nonparametric Perspective on Overparametrized Neural Network. Tianyang Hu, Wenjia Wang, Cong Lin, Guang Cheng. 06 Jul 2020.
- The Surprising Simplicity of the Early-Time Learning Dynamics of Neural Networks. Wei Hu, Lechao Xiao, Ben Adlam, Jeffrey Pennington. 25 Jun 2020.
- Implicitly Maximizing Margins with the Hinge Loss. Justin Lizama. 25 Jun 2020.
- Gradient descent follows the regularization path for general losses. Ziwei Ji, Miroslav Dudík, Robert Schapire, Matus Telgarsky. 19 Jun 2020. [AI4CE, FaML]
- When Does Preconditioning Help or Hurt Generalization? S. Amari, Jimmy Ba, Roger C. Grosse, Xuechen Li, Atsushi Nitanda, Taiji Suzuki, Denny Wu, Ji Xu. 18 Jun 2020.
- Directional Pruning of Deep Neural Networks. Shih-Kang Chao, Zhanyu Wang, Yue Xing, Guang Cheng. 16 Jun 2020. [ODL]
- Shape Matters: Understanding the Implicit Bias of the Noise Covariance. Jeff Z. HaoChen, Colin Wei, J. Lee, Tengyu Ma. 15 Jun 2020.
- Generalization by Recognizing Confusion. Daniel Chiu, Franklyn Wang, S. Kominers. 13 Jun 2020. [NoLa]
- Directional convergence and alignment in deep learning. Ziwei Ji, Matus Telgarsky. 11 Jun 2020.
- Structure preserving deep learning. E. Celledoni, Matthias Joachim Ehrhardt, Christian Etmann, R. McLachlan, B. Owren, Carola-Bibiane Schönlieb, Ferdia Sherry. 05 Jun 2020. [AI4CE]
- Is deeper better? It depends on locality of relevant features. Takashi Mori, Masahito Ueda. 26 May 2020. [OOD]
- Implicit Regularization in Deep Learning May Not Be Explainable by Norms. Noam Razin, Nadav Cohen. 13 May 2020.
- A function space analysis of finite neural networks with insights from sampling theory. Raja Giryes. 15 Apr 2020.
- Mirrorless Mirror Descent: A Natural Derivation of Mirror Descent. Suriya Gunasekar, Blake E. Woodworth, Nathan Srebro. 02 Apr 2020. [MDE]
- An Optimization and Generalization Analysis for Max-Pooling Networks. Alon Brutzkus, Amir Globerson. 22 Feb 2020. [MLT, AI4CE]
- On the Decision Boundaries of Neural Networks: A Tropical Geometry Perspective. Motasem Alfarra, Adel Bibi, Hasan Hammoud, M. Gaafar, Guohao Li. 20 Feb 2020.
- Unique Properties of Flat Minima in Deep Networks. Rotem Mulayoff, T. Michaeli. 11 Feb 2020. [ODL]
- Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss. Lénaïc Chizat, Francis R. Bach. 11 Feb 2020. [MLT]
- A Generalized Neural Tangent Kernel Analysis for Two-layer Neural Networks. Zixiang Chen, Yuan Cao, Quanquan Gu, Tong Zhang. 10 Feb 2020. [MLT]
- Reward Tweaking: Maximizing the Total Reward While Planning for Short Horizons. Chen Tessler, Shie Mannor. 09 Feb 2020.
- Sharp Rate of Convergence for Deep Neural Network Classifiers under the Teacher-Student Setting. Tianyang Hu, Zuofeng Shang, Guang Cheng. 19 Jan 2020.
- Double descent in the condition number. T. Poggio, Gil Kur, Andy Banburski. 12 Dec 2019.
- How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks? Zixiang Chen, Yuan Cao, Difan Zou, Quanquan Gu. 27 Nov 2019.
- Improved Sample Complexities for Deep Networks and Robust Classification via an All-Layer Margin. Colin Wei, Tengyu Ma. 09 Oct 2019. [AAML, OOD]
- A Function Space View of Bounded Norm Infinite Width ReLU Nets: The Multivariate Case. Greg Ongie, Rebecca Willett, Daniel Soudry, Nathan Srebro. 03 Oct 2019.
- Theoretical Issues in Deep Networks: Approximation, Optimization and Generalization. T. Poggio, Andrzej Banburski, Q. Liao. 25 Aug 2019. [ODL]
- Interpolated Adversarial Training: Achieving Robust Neural Networks without Sacrificing Too Much Accuracy. Alex Lamb, Vikas Verma, Kenji Kawaguchi, Alexander Matyasko, Savya Khosla, Arno Solin, Yoshua Bengio. 16 Jun 2019. [AAML]
- Kernel and Rich Regimes in Overparametrized Models. Blake E. Woodworth, Suriya Gunasekar, Pedro H. P. Savarese, E. Moroshko, Itay Golan, J. Lee, Daniel Soudry, Nathan Srebro. 13 Jun 2019.
- Theory III: Dynamics and Generalization in Deep Networks. Andrzej Banburski, Q. Liao, Brando Miranda, Lorenzo Rosasco, Fernanda De La Torre, Jack Hidary, T. Poggio. 12 Mar 2019. [AI4CE]
- Approximation by Combinations of ReLU and Squared ReLU Ridge Functions with $\ell^1$ and $\ell^0$ Controls. Jason M. Klusowski, Andrew R. Barron. 26 Jul 2016.