Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1811.08888
Cited By
Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks
21 November 2018
Difan Zou
Yuan Cao
Dongruo Zhou
Quanquan Gu
ODL
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks"
50 / 116 papers shown
Title
Prediction intervals for Deep Neural Networks
Tullio Mancini
Hector F. Calvo-Pardo
Jose Olmo
UQCV
OOD
23
4
0
08 Oct 2020
On the linearity of large non-linear models: when and why the tangent kernel is constant
Chaoyue Liu
Libin Zhu
M. Belkin
21
140
0
02 Oct 2020
Deep Equals Shallow for ReLU Networks in Kernel Regimes
A. Bietti
Francis R. Bach
28
86
0
30 Sep 2020
Tensor Programs III: Neural Matrix Laws
Greg Yang
14
43
0
22 Sep 2020
Review: Deep Learning in Electron Microscopy
Jeffrey M. Ede
34
79
0
17 Sep 2020
GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training
Tianle Cai
Shengjie Luo
Keyulu Xu
Di He
Tie-Yan Liu
Liwei Wang
GNN
32
158
0
07 Sep 2020
Predicting Training Time Without Training
L. Zancato
Alessandro Achille
Avinash Ravichandran
Rahul Bhotika
Stefano Soatto
26
24
0
28 Aug 2020
Single-Timescale Actor-Critic Provably Finds Globally Optimal Policy
Zuyue Fu
Zhuoran Yang
Zhaoran Wang
15
42
0
02 Aug 2020
Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy
E. Moroshko
Suriya Gunasekar
Blake E. Woodworth
J. Lee
Nathan Srebro
Daniel Soudry
35
85
0
13 Jul 2020
Tensor Programs II: Neural Tangent Kernel for Any Architecture
Greg Yang
58
134
0
25 Jun 2020
Directional convergence and alignment in deep learning
Ziwei Ji
Matus Telgarsky
17
163
0
11 Jun 2020
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory
Yufeng Zhang
Qi Cai
Zhuoran Yang
Yongxin Chen
Zhaoran Wang
OOD
MLT
105
11
0
08 Jun 2020
Is deeper better? It depends on locality of relevant features
Takashi Mori
Masahito Ueda
OOD
25
4
0
26 May 2020
Feature Purification: How Adversarial Training Performs Robust Deep Learning
Zeyuan Allen-Zhu
Yuanzhi Li
MLT
AAML
35
147
0
20 May 2020
A Mean-field Analysis of Deep ResNet and Beyond: Towards Provable Optimization Via Overparameterization From Depth
Yiping Lu
Chao Ma
Yulong Lu
Jianfeng Lu
Lexing Ying
MLT
39
78
0
11 Mar 2020
The large learning rate phase of deep learning: the catapult mechanism
Aitor Lewkowycz
Yasaman Bahri
Ethan Dyer
Jascha Narain Sohl-Dickstein
Guy Gur-Ari
ODL
159
234
0
04 Mar 2020
Overall error analysis for the training of deep neural networks via stochastic gradient descent with random initialisation
Arnulf Jentzen
Timo Welti
17
15
0
03 Mar 2020
Warwick Electron Microscopy Datasets
Jeffrey M. Ede
19
14
0
02 Mar 2020
Loss landscapes and optimization in over-parameterized non-linear systems and neural networks
Chaoyue Liu
Libin Zhu
M. Belkin
ODL
9
247
0
29 Feb 2020
Convergence of End-to-End Training in Deep Unsupervised Contrastive Learning
Zixin Wen
SSL
21
2
0
17 Feb 2020
Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality
Yi Zhang
Orestis Plevrakis
S. Du
Xingguo Li
Zhao Song
Sanjeev Arora
21
51
0
16 Feb 2020
LaProp: Separating Momentum and Adaptivity in Adam
Liu Ziyin
Zhikang T.Wang
Masahito Ueda
ODL
8
18
0
12 Feb 2020
Memory capacity of neural networks with threshold and ReLU activations
Roman Vershynin
31
21
0
20 Jan 2020
Revisiting Landscape Analysis in Deep Neural Networks: Eliminating Decreasing Paths to Infinity
Shiyu Liang
Ruoyu Sun
R. Srikant
35
19
0
31 Dec 2019
Optimization for deep learning: theory and algorithms
Ruoyu Sun
ODL
19
168
0
19 Dec 2019
Towards Understanding the Spectral Bias of Deep Learning
Yuan Cao
Zhiying Fang
Yue Wu
Ding-Xuan Zhou
Quanquan Gu
32
214
0
03 Dec 2019
Neural Contextual Bandits with UCB-based Exploration
Dongruo Zhou
Lihong Li
Quanquan Gu
36
15
0
11 Nov 2019
Enhanced Convolutional Neural Tangent Kernels
Zhiyuan Li
Ruosong Wang
Dingli Yu
S. Du
Wei Hu
Ruslan Salakhutdinov
Sanjeev Arora
16
131
0
03 Nov 2019
Global Convergence of Gradient Descent for Deep Linear Residual Networks
Lei Wu
Qingcan Wang
Chao Ma
ODL
AI4CE
28
22
0
02 Nov 2019
Growing axons: greedy learning of neural networks with application to function approximation
Daria Fokina
Ivan Oseledets
21
18
0
28 Oct 2019
The Local Elasticity of Neural Networks
Hangfeng He
Weijie J. Su
40
44
0
15 Oct 2019
Algorithm-Dependent Generalization Bounds for Overparameterized Deep Residual Networks
Spencer Frei
Yuan Cao
Quanquan Gu
ODL
9
31
0
07 Oct 2019
Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks
Sanjeev Arora
S. Du
Zhiyuan Li
Ruslan Salakhutdinov
Ruosong Wang
Dingli Yu
AAML
16
161
0
03 Oct 2019
Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks
Yu Bai
J. Lee
24
116
0
03 Oct 2019
Sample Efficient Policy Gradient Methods with Recursive Variance Reduction
Pan Xu
F. Gao
Quanquan Gu
28
83
0
18 Sep 2019
Stochastic AUC Maximization with Deep Neural Networks
Mingrui Liu
Zhuoning Yuan
Yiming Ying
Tianbao Yang
9
103
0
28 Aug 2019
Theoretical Issues in Deep Networks: Approximation, Optimization and Generalization
T. Poggio
Andrzej Banburski
Q. Liao
ODL
31
161
0
25 Aug 2019
The generalization error of random features regression: Precise asymptotics and double descent curve
Song Mei
Andrea Montanari
57
626
0
14 Aug 2019
Hessian based analysis of SGD for Deep Nets: Dynamics and Generalization
Xinyan Li
Qilong Gu
Yingxue Zhou
Tiancong Chen
A. Banerjee
ODL
42
51
0
24 Jul 2019
Benign Overfitting in Linear Regression
Peter L. Bartlett
Philip M. Long
Gábor Lugosi
Alexander Tsigler
MLT
8
762
0
26 Jun 2019
Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy
Boyi Liu
Qi Cai
Zhuoran Yang
Zhaoran Wang
24
108
0
25 Jun 2019
Gradient Descent Maximizes the Margin of Homogeneous Neural Networks
Kaifeng Lyu
Jian Li
52
322
0
13 Jun 2019
Kernel and Rich Regimes in Overparametrized Models
Blake E. Woodworth
Suriya Gunasekar
Pedro H. P. Savarese
E. Moroshko
Itay Golan
J. Lee
Daniel Soudry
Nathan Srebro
24
352
0
13 Jun 2019
Generalization bounds for deep convolutional neural networks
Philip M. Long
Hanie Sedghi
MLT
42
89
0
29 May 2019
Norm-based generalisation bounds for multi-class convolutional neural networks
Antoine Ledent
Waleed Mustafa
Yunwen Lei
Marius Kloft
18
5
0
29 May 2019
Gram-Gauss-Newton Method: Learning Overparameterized Neural Networks for Regression Problems
Tianle Cai
Ruiqi Gao
Jikai Hou
Siyu Chen
Dong Wang
Di He
Zhihua Zhang
Liwei Wang
ODL
21
57
0
28 May 2019
What Can ResNet Learn Efficiently, Going Beyond Kernels?
Zeyuan Allen-Zhu
Yuanzhi Li
24
183
0
24 May 2019
Gradient Descent can Learn Less Over-parameterized Two-layer Neural Networks on Classification Problems
Atsushi Nitanda
Geoffrey Chinot
Taiji Suzuki
MLT
16
33
0
23 May 2019
A type of generalization error induced by initialization in deep neural networks
Yaoyu Zhang
Zhi-Qin John Xu
Tao Luo
Zheng Ma
9
49
0
19 May 2019
Data-dependent Sample Complexity of Deep Neural Networks via Lipschitz Augmentation
Colin Wei
Tengyu Ma
20
109
0
09 May 2019
Previous
1
2
3
Next