ResearchTrend.AI
A Convergence Theory for Deep Learning via Over-Parameterization
Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song
arXiv:1811.03962, 9 November 2018
Communities: AI4CE, ODL

Papers citing "A Convergence Theory for Deep Learning via Over-Parameterization"

50 / 354 papers shown
1. Neural Tangents: Fast and Easy Infinite Neural Networks in Python
   Roman Novak, Lechao Xiao, Jiri Hron, Jaehoon Lee, Alexander A. Alemi, Jascha Narain Sohl-Dickstein, S. Schoenholz (05 Dec 2019)
2. Towards Understanding the Spectral Bias of Deep Learning
   Yuan Cao, Zhiying Fang, Yue Wu, Ding-Xuan Zhou, Quanquan Gu (03 Dec 2019)
3. Adaptive dynamic programming for nonaffine nonlinear optimal control problem with state constraints
   Jingliang Duan, Zhengyu Liu, Shengbo Eben Li, Qi Sun, Zhenzhong Jia, B. Cheng (26 Nov 2019)
4. Neural Contextual Bandits with UCB-based Exploration
   Dongruo Zhou, Lihong Li, Quanquan Gu (11 Nov 2019)
5. Enhanced Convolutional Neural Tangent Kernels
   Zhiyuan Li, Ruosong Wang, Dingli Yu, S. Du, Wei Hu, Ruslan Salakhutdinov, Sanjeev Arora (03 Nov 2019)
6. Global Convergence of Gradient Descent for Deep Linear Residual Networks
   Lei Wu, Qingcan Wang, Chao Ma (02 Nov 2019) [ODL, AI4CE]
7. Online Stochastic Gradient Descent with Arbitrary Initialization Solves Non-smooth, Non-convex Phase Retrieval
   Yan Shuo Tan, Roman Vershynin (28 Oct 2019)
8. The Local Elasticity of Neural Networks
   Hangfeng He, Weijie J. Su (15 Oct 2019)
9. Algorithm-Dependent Generalization Bounds for Overparameterized Deep Residual Networks
   Spencer Frei, Yuan Cao, Quanquan Gu (07 Oct 2019) [ODL]
10. Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks
    Sanjeev Arora, S. Du, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang, Dingli Yu (03 Oct 2019) [AAML]
11. Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks
    Yu Bai, J. Lee (03 Oct 2019)
12. Asymptotics of Wide Networks from Feynman Diagrams
    Ethan Dyer, Guy Gur-Ari (25 Sep 2019)
13. Sample Efficient Policy Gradient Methods with Recursive Variance Reduction
    Pan Xu, F. Gao, Quanquan Gu (18 Sep 2019)
14. Stochastic AUC Maximization with Deep Neural Networks
    Mingrui Liu, Zhuoning Yuan, Yiming Ying, Tianbao Yang (28 Aug 2019)
15. The generalization error of random features regression: Precise asymptotics and double descent curve
    Song Mei, Andrea Montanari (14 Aug 2019)
16. Gradient Descent Maximizes the Margin of Homogeneous Neural Networks
    Kaifeng Lyu, Jian Li (13 Jun 2019)
17. Kernel and Rich Regimes in Overparametrized Models
    Blake E. Woodworth, Suriya Gunasekar, Pedro H. P. Savarese, E. Moroshko, Itay Golan, J. Lee, Daniel Soudry, Nathan Srebro (13 Jun 2019)
18. Associated Learning: Decomposing End-to-end Backpropagation based on Auto-encoders and Target Propagation
    Yu-Wei Kao, Hung-Hsuan Chen (13 Jun 2019) [BDL]
19. Parameterized Structured Pruning for Deep Neural Networks
    Günther Schindler, Wolfgang Roth, Franz Pernkopf, Holger Froening (12 Jun 2019)
20. Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks
    Yuan Cao, Quanquan Gu (30 May 2019) [MLT, AI4CE]
21. Generalization bounds for deep convolutional neural networks
    Philip M. Long, Hanie Sedghi (29 May 2019) [MLT]
22. Enhancing Adversarial Defense by k-Winners-Take-All
    Chang Xiao, Peilin Zhong, Changxi Zheng (25 May 2019) [AAML]
23. What Can ResNet Learn Efficiently, Going Beyond Kernels?
    Zeyuan Allen-Zhu, Yuanzhi Li (24 May 2019)
24. Gradient Descent can Learn Less Over-parameterized Two-layer Neural Networks on Classification Problems
    Atsushi Nitanda, Geoffrey Chinot, Taiji Suzuki (23 May 2019) [MLT]
25. A type of generalization error induced by initialization in deep neural networks
    Yaoyu Zhang, Zhi-Qin John Xu, Tao Luo, Zheng Ma (19 May 2019)
26. Data-dependent Sample Complexity of Deep Neural Networks via Lipschitz Augmentation
    Colin Wei, Tengyu Ma (09 May 2019)
27. Linearized two-layers neural networks in high dimension
    Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari (27 Apr 2019) [MLT]
28. On Exact Computation with an Infinitely Wide Neural Net
    Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang (26 Apr 2019)
29. DSTP-RNN: a dual-stage two-phase attention-based recurrent neural networks for long-term and multivariate time series prediction
    Yeqi Liu, Chuanyang Gong, Ling Yang, Yingyi Chen (16 Apr 2019) [AI4TS]
30. A Selective Overview of Deep Learning
    Jianqing Fan, Cong Ma, Yiqiao Zhong (10 Apr 2019) [BDL, VLM]
31. Analysis of the Gradient Descent Algorithm for a Deep Neural Network Model with Skip-connections
    E. Weinan, Chao Ma, Qingcan Wang, Lei Wu (10 Apr 2019) [MLT]
32. Every Local Minimum Value is the Global Minimum Value of Induced Model in Non-convex Machine Learning
    Kenji Kawaguchi, Jiaoyang Huang, L. Kaelbling (07 Apr 2019) [AAML]
33. Convergence rates for the stochastic gradient descent method for non-convex objective functions
    Benjamin J. Fehrman, Benjamin Gess, Arnulf Jentzen (02 Apr 2019)
34. On the Power and Limitations of Random Features for Understanding Neural Networks
    Gilad Yehudai, Ohad Shamir (01 Apr 2019) [MLT]
35. Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks
    Mingchen Li, Mahdi Soltanolkotabi, Samet Oymak (27 Mar 2019) [NoLa]
36. Surprises in High-Dimensional Ridgeless Least Squares Interpolation
    Trevor Hastie, Andrea Montanari, Saharon Rosset, R. Tibshirani (19 Mar 2019)
37. A Priori Estimates of the Population Risk for Residual Networks
    E. Weinan, Chao Ma, Qingcan Wang (06 Mar 2019) [UQCV]
38. Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent
    Jaehoon Lee, Lechao Xiao, S. Schoenholz, Yasaman Bahri, Roman Novak, Jascha Narain Sohl-Dickstein, Jeffrey Pennington (18 Feb 2019)
39. Supervised Deep Neural Networks (DNNs) for Pricing/Calibration of Vanilla/Exotic Options Under Various Different Processes
    Ali Hirsa, T. Karatas, Amir Oskoui (15 Feb 2019)
40. Understanding over-parameterized deep networks by geometrization
    Xiao Dong, Ling Zhou (11 Feb 2019) [GNN, AI4CE]
41. Dynamical Isometry and a Mean Field Theory of LSTMs and GRUs
    D. Gilboa, B. Chang, Minmin Chen, Greg Yang, S. Schoenholz, Ed H. Chi, Jeffrey Pennington (25 Jan 2019)
42. Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks
    Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruosong Wang (24 Jan 2019) [MLT]
43. Width Provably Matters in Optimization for Deep Linear Neural Networks
    S. Du, Wei Hu (24 Jan 2019)
44. Scaling description of generalization with number of parameters in deep learning
    Mario Geiger, Arthur Jacot, S. Spigler, Franck Gabriel, Levent Sagun, Stéphane d'Ascoli, Giulio Biroli, Clément Hongler, M. Wyart (06 Jan 2019)
45. Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks
    Difan Zou, Yuan Cao, Dongruo Zhou, Quanquan Gu (21 Nov 2018) [ODL]
46. Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers
    Zeyuan Allen-Zhu, Yuanzhi Li, Yingyu Liang (12 Nov 2018) [MLT]
47. Gradient Descent Finds Global Minima of Deep Neural Networks
    S. Du, J. Lee, Haochuan Li, Liwei Wang, Masayoshi Tomizuka (09 Nov 2018) [ODL]
48. On the Convergence Rate of Training Recurrent Neural Networks
    Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song (29 Oct 2018)
49. Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity
    Chulhee Yun, S. Sra, Ali Jadbabaie (17 Oct 2018)
50. Learning Two-layer Neural Networks with Symmetric Inputs
    Rong Ge, Rohith Kuditipudi, Zhize Li, Xiang Wang (16 Oct 2018) [OOD, MLT]