Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1810.12065
Cited By
On the Convergence Rate of Training Recurrent Neural Networks
29 October 2018
Zeyuan Allen-Zhu
Yuanzhi Li
Zhao Song
Re-assign community
ArXiv
PDF
HTML
Papers citing
"On the Convergence Rate of Training Recurrent Neural Networks"
28 / 128 papers shown
Title
How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks?
Zixiang Chen
Yuan Cao
Difan Zou
Quanquan Gu
22
122
0
27 Nov 2019
Machine Learning for Prediction with Missing Dynamics
J. Harlim
Shixiao W. Jiang
Senwei Liang
Haizhao Yang
AI4CE
14
60
0
13 Oct 2019
A Constructive Prediction of the Generalization Error Across Scales
Jonathan S. Rosenfeld
Amir Rosenfeld
Yonatan Belinkov
Nir Shavit
36
205
0
27 Sep 2019
Polylogarithmic width suffices for gradient descent to achieve arbitrarily small test error with shallow ReLU networks
Ziwei Ji
Matus Telgarsky
27
177
0
26 Sep 2019
Mildly Overparametrized Neural Nets can Memorize Training Data Efficiently
Rong Ge
Runzhe Wang
Haoyu Zhao
TDI
18
20
0
26 Sep 2019
Hessian based analysis of SGD for Deep Nets: Dynamics and Generalization
Xinyan Li
Qilong Gu
Yingxue Zhou
Tiancong Chen
A. Banerjee
ODL
42
51
0
24 Jul 2019
A Fine-Grained Spectral Perspective on Neural Networks
Greg Yang
Hadi Salman
30
110
0
24 Jul 2019
Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks
Yuanzhi Li
Colin Wei
Tengyu Ma
6
291
0
10 Jul 2019
An Improved Analysis of Training Over-parameterized Deep Neural Networks
Difan Zou
Quanquan Gu
21
230
0
11 Jun 2019
Quadratic Suffices for Over-parametrization via Matrix Chernoff Bound
Zhao Song
Xin Yang
15
91
0
09 Jun 2019
Learning in Gated Neural Networks
Ashok Vardhan Makkuva
Sewoong Oh
Sreeram Kannan
Pramod Viswanath
11
10
0
06 Jun 2019
The Convergence Rate of Neural Networks for Learned Functions of Different Frequencies
Ronen Basri
David Jacobs
Yoni Kasten
S. Kritchman
8
216
0
02 Jun 2019
Enhancing Adversarial Defense by k-Winners-Take-All
Chang Xiao
Peilin Zhong
Changxi Zheng
AAML
24
97
0
25 May 2019
What Can ResNet Learn Efficiently, Going Beyond Kernels?
Zeyuan Allen-Zhu
Yuanzhi Li
24
183
0
24 May 2019
Gradient Descent can Learn Less Over-parameterized Two-layer Neural Networks on Classification Problems
Atsushi Nitanda
Geoffrey Chinot
Taiji Suzuki
MLT
16
33
0
23 May 2019
Stabilize Deep ResNet with A Sharp Scaling Factor
τ
τ
τ
Huishuai Zhang
Da Yu
Mingyang Yi
Wei Chen
Tie-Yan Liu
32
8
0
17 Mar 2019
Implicit Regularization in Over-parameterized Neural Networks
M. Kubo
Ryotaro Banno
Hidetaka Manabe
Masataka Minoji
19
23
0
05 Mar 2019
Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent
Jaehoon Lee
Lechao Xiao
S. Schoenholz
Yasaman Bahri
Roman Novak
Jascha Narain Sohl-Dickstein
Jeffrey Pennington
34
1,077
0
18 Feb 2019
Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation
Greg Yang
11
284
0
13 Feb 2019
Generalization Error Bounds of Gradient Descent for Learning Over-parameterized Deep ReLU Networks
Yuan Cao
Quanquan Gu
ODL
MLT
AI4CE
25
155
0
04 Feb 2019
Can SGD Learn Recurrent Neural Networks with Provable Generalization?
Zeyuan Allen-Zhu
Yuanzhi Li
MLT
LRM
11
57
0
04 Feb 2019
Towards a Theoretical Understanding of Hashing-Based Neural Nets
Yibo Lin
Zhao Song
Lin F. Yang
14
5
0
26 Dec 2018
Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks
Difan Zou
Yuan Cao
Dongruo Zhou
Quanquan Gu
ODL
33
446
0
21 Nov 2018
Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers
Zeyuan Allen-Zhu
Yuanzhi Li
Yingyu Liang
MLT
32
765
0
12 Nov 2018
A Convergence Theory for Deep Learning via Over-Parameterization
Zeyuan Allen-Zhu
Yuanzhi Li
Zhao Song
AI4CE
ODL
42
1,448
0
09 Nov 2018
Gradient Descent Finds Global Minima of Deep Neural Networks
S. Du
J. Lee
Haochuan Li
Liwei Wang
Masayoshi Tomizuka
ODL
44
1,125
0
09 Nov 2018
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
Lechao Xiao
Yasaman Bahri
Jascha Narain Sohl-Dickstein
S. Schoenholz
Jeffrey Pennington
244
349
0
14 Jun 2018
Spectral Filtering for General Linear Dynamical Systems
Elad Hazan
Holden Lee
Karan Singh
Cyril Zhang
Yi Zhang
45
97
0
12 Feb 2018
Previous
1
2
3