Global Convergence of Adaptive Gradient Methods for An Over-parameterized Neural Network (arXiv:1902.07111)
Xiaoxia Wu, S. Du, Rachel A. Ward
19 February 2019
Papers citing "Global Convergence of Adaptive Gradient Methods for An Over-parameterized Neural Network" (8 papers):
Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks
Matteo Tucat, Anirbit Mukherjee, Procheta Sen, Mingfei Sun, Omar Rivasplata (12 Apr 2024)
Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks
Difan Zou, Yuan Cao, Dongruo Zhou, Quanquan Gu (21 Nov 2018)
Convergence and Dynamical Behavior of the ADAM Algorithm for Non-Convex Stochastic Optimization
Anas Barakat, Pascal Bianchi (04 Oct 2018)
On the Convergence of Adaptive Gradient Methods for Nonconvex Optimization
Dongruo Zhou, Yiqi Tang, Yuan Cao, Ziyan Yang, Quanquan Gu (16 Aug 2018)
Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data
Yuanzhi Li, Yingyu Liang (03 Aug 2018)
Closing the Generalization Gap of Adaptive Gradient Methods in Training Deep Neural Networks
Jinghui Chen, Dongruo Zhou, Yiqi Tang, Ziyan Yang, Yuan Cao, Quanquan Gu (18 Jun 2018)
WNGrad: Learn the Learning Rate in Gradient Descent
Xiaoxia Wu, Rachel A. Ward, Léon Bottou (07 Mar 2018)
ADADELTA: An Adaptive Learning Rate Method
Matthew D. Zeiler (22 Dec 2012)