Gradient Descent Provably Optimizes Over-parameterized Neural Networks

4 October 2018
S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh
MLT, ODL
arXiv: 1810.02054

Papers citing "Gradient Descent Provably Optimizes Over-parameterized Neural Networks"

Showing 50 of 882 citing papers.
A Theory-Driven Self-Labeling Refinement Method for Contrastive Representation Learning
Pan Zhou, Caiming Xiong, Xiaotong Yuan, Guosheng Lin
28 Jun 2021 · SSL

Connection Sensitivity Matters for Training-free DARTS: From Architecture-Level Scoring to Operation-Level Sensitivity Analysis
Miao Zhang, Wei Huang, Li Wang
22 Jun 2021

Locality defeats the curse of dimensionality in convolutional teacher-student scenarios
Alessandro Favero, Francesco Cagnetta, Matthieu Wyart
16 Jun 2021

Scaling Neural Tangent Kernels via Sketching and Random Features
A. Zandieh, Insu Han, H. Avron, N. Shoham, Chaewon Kim, Jinwoo Shin
15 Jun 2021

On the Convergence and Calibration of Deep Learning with Differential Privacy
Zhiqi Bu, Hua Wang, Zongyu Dai, Qi Long
15 Jun 2021

Solving PDEs on Unknown Manifolds with Machine Learning
Senwei Liang, Shixiao W. Jiang, J. Harlim, Haizhao Yang
12 Jun 2021 · AI4CE

Understanding Deflation Process in Over-parametrized Tensor Decomposition
Rong Ge, Y. Ren, Xiang Wang, Mo Zhou
11 Jun 2021

On Learnability via Gradient Method for Two-Layer ReLU Neural Networks in Teacher-Student Setting
Shunta Akiyama, Taiji Suzuki
11 Jun 2021 · MLT

Neural Optimization Kernel: Towards Robust Deep Learning
Yueming Lyu, Ivor Tsang
11 Jun 2021

Convergence and Alignment of Gradient Descent with Random Backpropagation Weights
Ganlin Song, Ruitu Xu, John D. Lafferty
10 Jun 2021 · ODL

Early-stopped neural networks are consistent
Ziwei Ji, Justin D. Li, Matus Telgarsky
10 Jun 2021

Initialization Matters: Regularizing Manifold-informed Initialization for Neural Recommendation Systems
Yinan Zhang, Boyang Albert Li, Yong Liu, Hao Wang, Chunyan Miao
09 Jun 2021

Fractal Structure and Generalization Properties of Stochastic Optimization Algorithms
A. Camuto, George Deligiannidis, Murat A. Erdogdu, Mert Gurbuzbalaban, Umut Şimşekli, Lingjiong Zhu
09 Jun 2021

Submodular + Concave
Siddharth Mitra, Moran Feldman, Amin Karbasi
09 Jun 2021

What training reveals about neural network complexity
Andreas Loukas, Marinos Poiitis, Stefanie Jegelka
08 Jun 2021

TENGraD: Time-Efficient Natural Gradient Descent with Exact Fisher-Block Inversion
Saeed Soori, Bugra Can, Baourun Mu, Mert Gurbuzbalaban, M. Dehnavi
07 Jun 2021

Neural Tangent Kernel Maximum Mean Discrepancy
Xiuyuan Cheng, Yao Xie
06 Jun 2021

Learning and Generalization in RNNs
A. Panigrahi, Navin Goyal
31 May 2021

Overparameterization of deep ResNet: zero loss and mean-field analysis
Zhiyan Ding, Shi Chen, Qin Li, S. Wright
30 May 2021 · ODL

Fit without fear: remarkable mathematical phenomena of deep learning through the prism of interpolation
M. Belkin
29 May 2021

Practical Convex Formulation of Robust One-hidden-layer Neural Network Training
Yatong Bai, Tanmay Gautam, Yujie Gai, Somayeh Sojoudi
25 May 2021 · AAML

Geometry of the Loss Landscape in Overparameterized Neural Networks: Symmetries and Invariances
Berfin Şimşek, François Ged, Arthur Jacot, Francesco Spadaro, Clément Hongler, W. Gerstner, Johanni Brea
25 May 2021 · AI4CE

Properties of the After Kernel
Philip M. Long
21 May 2021

Deep Kronecker neural networks: A general framework for neural networks with adaptive activation functions
Ameya Dilip Jagtap, Yeonjong Shin, Kenji Kawaguchi, George Karniadakis
20 May 2021 · ODL

The Dynamics of Gradient Descent for Overparametrized Neural Networks
Siddhartha Satpathi, R. Srikant
13 May 2021 · MLT, AI4CE

Convergence and Implicit Bias of Gradient Flow on Overparametrized Linear Networks
Hancheng Min, Salma Tarmoun, René Vidal, Enrique Mallada
13 May 2021 · MLT

Why Does Multi-Epoch Training Help?
Yi Tian Xu, Qi Qian, Hao Li, Rong Jin
13 May 2021

Global Convergence of Three-layer Neural Networks in the Mean Field Regime
H. Pham, Phan-Minh Nguyen
11 May 2021 · MLT, AI4CE

FL-NTK: A Neural Tangent Kernel-based Framework for Federated Learning Convergence Analysis
Baihe Huang, Xiaoxiao Li, Zhao Song, Xin Yang
11 May 2021 · FedML

Optimization of Graph Neural Networks: Implicit Acceleration by Skip Connections and More Depth
Keyulu Xu, Mozhi Zhang, Stefanie Jegelka, Kenji Kawaguchi
10 May 2021 · GNN

Tensor Programs IIb: Architectural Universality of Neural Tangent Kernel Training Dynamics
Greg Yang, Etai Littwin
08 May 2021

RATT: Leveraging Unlabeled Data to Guarantee Generalization
Saurabh Garg, Sivaraman Balakrishnan, J. Zico Kolter, Zachary Chase Lipton
01 May 2021

One-pass Stochastic Gradient Descent in Overparametrized Two-layer Neural Networks
Hanjing Zhu
01 May 2021 · MLT

Generalization Guarantees for Neural Architecture Search with Train-Validation Split
Samet Oymak, Mingchen Li, Mahdi Soltanolkotabi
29 Apr 2021 · AI4CE, OOD

PCFGs Can Do Better: Inducing Probabilistic Context-Free Grammars with Many Symbols
Aaron Courville, Yanpeng Zhao, Kewei Tu
28 Apr 2021

Achieving Small Test Error in Mildly Overparameterized Neural Networks
Shiyu Liang, Ruoyu Sun, R. Srikant
24 Apr 2021

On the validity of kernel approximations for orthogonally-initialized neural networks
James Martens
13 Apr 2021

A Recipe for Global Convergence Guarantee in Deep Neural Networks
Kenji Kawaguchi, Qingyun Sun
12 Apr 2021

Understanding Overparameterization in Generative Adversarial Networks
Yogesh Balaji, M. Sajedi, Neha Kalibhat, Mucong Ding, Dominik Stöger, Mahdi Soltanolkotabi, Soheil Feizi
12 Apr 2021 · AI4CE

Noether: The More Things Change, the More Stay the Same
Grzegorz Gluch, R. Urbanke
12 Apr 2021

A Theoretical Analysis of Learning with Noisily Labeled Data
Yi Tian Xu, Qi Qian, Hao Li, Rong Jin
08 Apr 2021 · NoLa

Training Deep Neural Networks via Branch-and-Bound
Yuanwei Wu, Ziming Zhang, Guanghui Wang
05 Apr 2021 · ODL

Random Features for the Neural Tangent Kernel
Insu Han, H. Avron, N. Shoham, Chaewon Kim, Jinwoo Shin
03 Apr 2021

Learning with Neural Tangent Kernels in Near Input Sparsity Time
A. Zandieh
01 Apr 2021

A proof of convergence for stochastic gradient descent in the training of artificial neural networks with ReLU activation for constant target functions
Arnulf Jentzen, Adrian Riekert
01 Apr 2021 · MLT

Minimum complexity interpolation in random features models
Michael Celentano, Theodor Misiakiewicz, Andrea Montanari
30 Mar 2021

One Network Fits All? Modular versus Monolithic Task Formulations in Neural Networks
Atish Agarwala, Abhimanyu Das, Brendan Juba, Rina Panigrahy, Vatsal Sharan, Xin Wang, Qiuyi Zhang
29 Mar 2021 · MoMe

On the Stability of Nonlinear Receding Horizon Control: A Geometric Perspective
T. Westenbroek, Max Simchowitz, Michael I. Jordan, S. Shankar Sastry
27 Mar 2021

Why Do Local Methods Solve Nonconvex Problems?
Tengyu Ma
24 Mar 2021

Policy Information Capacity: Information-Theoretic Measure for Task Complexity in Deep Reinforcement Learning
Hiroki Furuta, T. Matsushima, Tadashi Kozuno, Y. Matsuo, Sergey Levine, Ofir Nachum, S. Gu
23 Mar 2021 · OffRL