1810.12065
On the Convergence Rate of Training Recurrent Neural Networks
29 October 2018
Zeyuan Allen-Zhu
Yuanzhi Li
Zhao Song
Papers citing
"On the Convergence Rate of Training Recurrent Neural Networks"
50 / 128 papers shown
Title
Pixelated Butterfly: Simple and Efficient Sparse training for Neural Network Models
Tri Dao
Beidi Chen
Kaizhao Liang
Jiaming Yang
Zhao Song
Atri Rudra
Christopher Ré
33
75
0
30 Nov 2021
High Quality Segmentation for Ultra High-resolution Images
Tiancheng Shen
Yuechen Zhang
Lu Qi
Jason Kuen
Xingyu Xie
Jianlong Wu
Zhe-nan Lin
Jiaya Jia
81
43
0
29 Nov 2021
Learning with convolution and pooling operations in kernel methods
Theodor Misiakiewicz
Song Mei
MLT
15
29
0
16 Nov 2021
Theoretical Exploration of Flexible Transmitter Model
Jin-Hui Wu
Shao-Qun Zhang
Yuan Jiang
Zhiping Zhou
44
3
0
11 Nov 2021
Dynamics of Local Elasticity During Training of Neural Nets
Soham Dan
Anirbit Mukherjee
Avirup Das
Phanideep Gampa
25
0
0
01 Nov 2021
Quantifying Epistemic Uncertainty in Deep Learning
Ziyi Huang
H. Lam
Haofeng Zhang
UQCV
BDL
UD
PER
24
12
0
23 Oct 2021
Does Preprocessing Help Training Over-parameterized Neural Networks?
Zhao Song
Shuo Yang
Ruizhe Zhang
38
49
0
09 Oct 2021
On the Provable Generalization of Recurrent Neural Networks
Lifu Wang
Bo Shen
Bo Hu
Xing Cao
39
8
0
29 Sep 2021
Understanding the Generalization of Adam in Learning Neural Networks with Proper Regularization
Difan Zou
Yuan Cao
Yuanzhi Li
Quanquan Gu
MLT
AI4CE
47
39
0
25 Aug 2021
Fast Sketching of Polynomial Kernels of Polynomial Degree
Zhao Song
David P. Woodruff
Zheng Yu
Lichen Zhang
21
40
0
21 Aug 2021
SGD: The Role of Implicit Regularization, Batch-size and Multiple-epochs
Satyen Kale
Ayush Sekhari
Karthik Sridharan
193
29
0
11 Jul 2021
Regularized OFU: an Efficient UCB Estimator for Non-linear Contextual Bandit
Yichi Zhou
Shihong Song
Huishuai Zhang
Jun Zhu
Wei Chen
Tie-Yan Liu
18
0
0
29 Jun 2021
Communication-efficient SGD: From Local SGD to One-Shot Averaging
Artin Spiridonoff
Alexander Olshevsky
I. Paschalidis
FedML
34
20
0
09 Jun 2021
Learning and Generalization in RNNs
A. Panigrahi
Navin Goyal
27
3
0
31 May 2021
Toward Understanding the Feature Learning Process of Self-supervised Contrastive Learning
Zixin Wen
Yuanzhi Li
SSL
MLT
32
131
0
31 May 2021
Near-optimal Offline and Streaming Algorithms for Learning Non-Linear Dynamical Systems
Prateek Jain
S. Kowshik
Dheeraj M. Nagaraj
Praneeth Netrapalli
OffRL
11
23
0
24 May 2021
FL-NTK: A Neural Tangent Kernel-based Framework for Federated Learning Convergence Analysis
Baihe Huang
Xiaoxiao Li
Zhao Song
Xin Yang
FedML
31
16
0
11 May 2021
One-pass Stochastic Gradient Descent in Overparametrized Two-layer Neural Networks
Jiaming Xu
Hanjing Zhu
MLT
13
3
0
01 May 2021
Why Do Local Methods Solve Nonconvex Problems?
Tengyu Ma
18
13
0
24 Mar 2021
The Discovery of Dynamics via Linear Multistep Methods and Deep Learning: Error Estimation
Q. Du
Yiqi Gu
Haizhao Yang
Chao Zhou
26
20
0
21 Mar 2021
Learning with invariances in random features and kernel models
Song Mei
Theodor Misiakiewicz
Andrea Montanari
OOD
55
89
0
25 Feb 2021
Recurrent Model Predictive Control
Zhengyu Liu
Jingliang Duan
Wenxuan Wang
Shengbo Eben Li
Yuming Yin
Ziyu Lin
Qi Sun
B. Cheng
18
0
0
23 Feb 2021
Provably Training Overparameterized Neural Network Classifiers with Non-convex Constraints
You-Lin Chen
Zhaoran Wang
Mladen Kolar
16
0
0
30 Dec 2020
Towards Understanding Ensemble, Knowledge Distillation and Self-Distillation in Deep Learning
Zeyuan Allen-Zhu
Yuanzhi Li
FedML
60
356
0
17 Dec 2020
Recent Theoretical Advances in Non-Convex Optimization
Marina Danilova
Pavel Dvurechensky
Alexander Gasnikov
Eduard A. Gorbunov
Sergey Guminov
Dmitry Kamzolov
Innokentiy Shibaev
33
77
0
11 Dec 2020
Effect of the initial configuration of weights on the training and function of artificial neural networks
Ricardo J. Jesus
Mário Antunes
R. A. D. Costa
S. Dorogovtsev
J. F. F. Mendes
R. Aguiar
12
15
0
04 Dec 2020
Metric Transforms and Low Rank Matrices via Representation Theory of the Real Hyperrectangle
Josh Alman
T. Chu
Gary Miller
Shyam Narayanan
Mark Sellke
Zhao Song
6
1
0
23 Nov 2020
Algorithms and Hardness for Linear Algebra on Geometric Graphs
Josh Alman
T. Chu
Aaron Schild
Zhao Song
57
29
0
04 Nov 2020
Which Minimizer Does My Neural Network Converge To?
Manuel Nonnenmacher
David Reeb
Ingo Steinwart
ODL
8
4
0
04 Nov 2020
Wearing a MASK: Compressed Representations of Variable-Length Sequences Using Recurrent Neural Tangent Kernels
Sina Alemohammad
Hossein Babaei
Randall Balestriero
Matt Y. Cheung
Ahmed Imtiaz Humayun
...
Naiming Liu
Lorenzo Luzi
Jasper Tan
Zichao Wang
Richard G. Baraniuk
9
4
0
27 Oct 2020
A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks
Zhiqi Bu
Shiyun Xu
Kan Chen
33
17
0
25 Oct 2020
MixCon: Adjusting the Separability of Data Representations for Harder Data Recovery
Xiaoxiao Li
Yangsibo Huang
Binghui Peng
Zhao Song
Keqin Li
MIACV
30
1
0
22 Oct 2020
RNN Training along Locally Optimal Trajectories via Frank-Wolfe Algorithm
Yun Yue
Ming Li
Venkatesh Saligrama
Ziming Zhang
11
4
0
12 Oct 2020
Tensor Programs III: Neural Matrix Laws
Greg Yang
14
44
0
22 Sep 2020
Generalized Leverage Score Sampling for Neural Networks
J. Lee
Ruoqi Shen
Zhao Song
Mengdi Wang
Zheng Yu
21
42
0
21 Sep 2020
A priori guarantees of finite-time convergence for Deep Neural Networks
Anushree Rankawat
M. Rankawat
Harshal B. Oza
14
0
0
16 Sep 2020
Quantitative Propagation of Chaos for SGD in Wide Neural Networks
Valentin De Bortoli
Alain Durmus
Xavier Fontaine
Umut Simsekli
32
25
0
13 Jul 2020
Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK
Yuanzhi Li
Tengyu Ma
Hongyang R. Zhang
MLT
20
28
0
09 Jul 2020
The Global Landscape of Neural Networks: An Overview
Ruoyu Sun
Dawei Li
Shiyu Liang
Tian Ding
R. Srikant
22
84
0
02 Jul 2020
Tensor Programs II: Neural Tangent Kernel for Any Architecture
Greg Yang
58
135
0
25 Jun 2020
Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime
Atsushi Nitanda
Taiji Suzuki
6
41
0
22 Jun 2020
Training (Overparametrized) Neural Networks in Near-Linear Time
Jan van den Brand
Binghui Peng
Zhao Song
Omri Weinstein
ODL
29
82
0
20 Jun 2020
The Recurrent Neural Tangent Kernel
Sina Alemohammad
Zichao Wang
Randall Balestriero
Richard Baraniuk
AAML
6
77
0
18 Jun 2020
A General Framework for Analyzing Stochastic Dynamics in Learning Algorithms
Chi-Ning Chou
Juspreet Singh Sandhu
Mien Brabeeba Wang
Tiancheng Yu
11
4
0
11 Jun 2020
Feature Purification: How Adversarial Training Performs Robust Deep Learning
Zeyuan Allen-Zhu
Yuanzhi Li
MLT
AAML
39
147
0
20 May 2020
Provable Training of a ReLU Gate with an Iterative Non-Gradient Algorithm
Sayar Karmakar
Anirbit Mukherjee
14
7
0
08 May 2020
Frequency Bias in Neural Networks for Input of Non-Uniform Density
Ronen Basri
Meirav Galun
Amnon Geifman
David Jacobs
Yoni Kasten
S. Kritchman
42
183
0
10 Mar 2020
RNN-based Online Learning: An Efficient First-Order Optimization Algorithm with a Convergence Guarantee
Nuri Mert Vural
Selim F. Yilmaz
Fatih Ilhan
Suleyman Serdar Kozat
OnRL
19
1
0
07 Mar 2020
Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality
Yi Zhang
Orestis Plevrakis
S. Du
Xingguo Li
Zhao Song
Sanjeev Arora
29
51
0
16 Feb 2020
Distributionally Robust Deep Learning using Hardness Weighted Sampling
Lucas Fidon
Michael Aertsen
Thomas Deprest
Doaa Emam
Frédéric Guffens
...
Andrew Melbourne
Sébastien Ourselin
Jan Deprest
Georg Langs
Tom Kamiel Magda Vercauteren
OOD
22
10
0
08 Jan 2020