ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Gradient Descent Provably Optimizes Over-parameterized Neural Networks

4 October 2018
S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh
arXiv: 1810.02054 (v2, latest)
Tags: MLT, ODL

Papers citing "Gradient Descent Provably Optimizes Over-parameterized Neural Networks"

50 / 882 papers shown
• Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? -- A Neural Tangent Kernel Perspective
  Kaixuan Huang, Yuqing Wang, Molei Tao, T. Zhao · MLT · 14 Feb 2020
• Stochasticity of Deterministic Gradient Descent: Large Learning Rate for Multiscale Objective Function
  Lingkai Kong, Molei Tao · 14 Feb 2020
• Estimating Uncertainty Intervals from Collaborating Networks
  Tianhui Zhou, Yitong Li, Yuan Wu, David Carlson · UQCV · 12 Feb 2020
• Training Two-Layer ReLU Networks with Gradient Descent is Inconsistent
  David Holzmüller, Ingo Steinwart · MLT · 12 Feb 2020
• Implicit Bias of Gradient Descent for Wide Two-layer Neural Networks Trained with the Logistic Loss
  Lénaïc Chizat, Francis R. Bach · MLT · 11 Feb 2020
• A Generalized Neural Tangent Kernel Analysis for Two-layer Neural Networks
  Zixiang Chen, Yuan Cao, Quanquan Gu, Tong Zhang · MLT · 10 Feb 2020
• Taylorized Training: Towards Better Approximation of Neural Network Training at Finite Width
  Yu Bai, Ben Krause, Huan Wang, Caiming Xiong, R. Socher · 10 Feb 2020
• Quasi-Equivalence of Width and Depth of Neural Networks
  Fenglei Fan, Rongjie Lai, Ge Wang · 06 Feb 2020
• A Deep Conditioning Treatment of Neural Networks
  Naman Agarwal, Pranjal Awasthi, Satyen Kale · AI4CE · 04 Feb 2020
• Learning from Noisy Similar and Dissimilar Data
  Soham Dan, Han Bao, Masashi Sugiyama · NoLa · 03 Feb 2020
• Proving the Lottery Ticket Hypothesis: Pruning is All You Need
  Eran Malach, Gilad Yehudai, Shai Shalev-Shwartz, Ohad Shamir · 03 Feb 2020
• Memory capacity of neural networks with threshold and ReLU activations
  Roman Vershynin · 20 Jan 2020
• On Iterative Neural Network Pruning, Reinitialization, and the Similarity of Masks
  Michela Paganini, Jessica Zosa Forde · 14 Jan 2020
• Disentangling Trainability and Generalization in Deep Neural Networks
  Lechao Xiao, Jeffrey Pennington, S. Schoenholz · 30 Dec 2019
• On the Principle of Least Symmetry Breaking in Shallow ReLU Models
  Yossi Arjevani, M. Field · 26 Dec 2019
• Landscape Connectivity and Dropout Stability of SGD Solutions for Over-parameterized Neural Networks
  Aleksandr Shevchenko, Marco Mondelli · 20 Dec 2019
• Second-order Information in First-order Optimization Methods
  Yuzheng Hu, Licong Lin, Shange Tang · ODL · 20 Dec 2019
• On the Bias-Variance Tradeoff: Textbooks Need an Update
  Brady Neal · 17 Dec 2019
• A Finite-Time Analysis of Q-Learning with Neural Network Function Approximation
  Pan Xu, Quanquan Gu · 10 Dec 2019
• Neural Tangents: Fast and Easy Infinite Neural Networks in Python
  Roman Novak, Lechao Xiao, Jiri Hron, Jaehoon Lee, Alexander A. Alemi, Jascha Narain Sohl-Dickstein, S. Schoenholz · 05 Dec 2019
• Stationary Points of Shallow Neural Networks with Quadratic Activation Function
  D. Gamarnik, Eren C. Kizildag, Ilias Zadik · 03 Dec 2019
• Towards Understanding the Spectral Bias of Deep Learning
  Yuan Cao, Zhiying Fang, Yue Wu, Ding-Xuan Zhou, Quanquan Gu · 03 Dec 2019
• How Much Over-parameterization Is Sufficient to Learn Deep ReLU Networks?
  Zixiang Chen, Yuan Cao, Difan Zou, Quanquan Gu · 27 Nov 2019
• Benefits of Jointly Training Autoencoders: An Improved Neural Tangent Kernel Analysis
  Thanh Van Nguyen, Raymond K. W. Wong, Chinmay Hegde · 27 Nov 2019
• Implicit Regularization and Convergence for Weight Normalization
  Xiaoxia Wu, Yan Sun, Zhaolin Ren, Shanshan Wu, Zhiyuan Li, Suriya Gunasekar, Rachel A. Ward, Qiang Liu · 18 Nov 2019
• Convex Formulation of Overparameterized Deep Neural Networks
  Cong Fang, Yihong Gu, Weizhong Zhang, Tong Zhang · 18 Nov 2019
• Asymptotics of Reinforcement Learning with Neural Networks
  Justin A. Sirignano, K. Spiliopoulos · MLT · 13 Nov 2019
• Quadratic number of nodes is sufficient to learn a dataset via gradient descent
  Biswarup Das, Eugene Golikov · MLT · 13 Nov 2019
• Tight Sample Complexity of Learning One-hidden-layer Convolutional Neural Networks
  Yuan Cao, Quanquan Gu · MLT · 12 Nov 2019
• Neural Contextual Bandits with UCB-based Exploration
  Dongruo Zhou, Lihong Li, Quanquan Gu · 11 Nov 2019
• Stronger Convergence Results for Deep Residual Networks: Network Width Scales Linearly with Training Data Size
  Talha Cihad Gulcu · 11 Nov 2019
• Enhanced Convolutional Neural Tangent Kernels
  Zhiyuan Li, Ruosong Wang, Dingli Yu, S. Du, Wei Hu, Ruslan Salakhutdinov, Sanjeev Arora · 03 Nov 2019
• Global Convergence of Gradient Descent for Deep Linear Residual Networks
  Lei Wu, Qingcan Wang, Chao Ma · ODL, AI4CE · 02 Nov 2019
• Denoising and Regularization via Exploiting the Structural Bias of Convolutional Generators
  Reinhard Heckel, Mahdi Soltanolkotabi · DiffM · 31 Oct 2019
• Learning Boolean Circuits with Neural Networks
  Eran Malach, Shai Shalev-Shwartz · 25 Oct 2019
• Over Parameterized Two-level Neural Networks Can Learn Near Optimal Feature Representations
  Cong Fang, Hanze Dong, Tong Zhang · 25 Oct 2019
• Capacity, Bandwidth, and Compositionality in Emergent Language Learning
  Cinjon Resnick, Abhinav Gupta, Jakob N. Foerster, Andrew M. Dai, Kyunghyun Cho · 24 Oct 2019
• Image recognition from raw labels collected without annotators
  Fatih Yilmaz, Reinhard Heckel · NoLa · 20 Oct 2019
• Self-Adaptive Network Pruning
  Jinting Chen, Zhaocheng Zhu, Chengwei Li, Yuming Zhao · 3DPC · 20 Oct 2019
• Neural tangent kernels, transportation mappings, and universal approximation
  Ziwei Ji, Matus Telgarsky, Ruicheng Xian · 15 Oct 2019
• The Local Elasticity of Neural Networks
  Hangfeng He, Weijie J. Su · 15 Oct 2019
• Effects of Depth, Width, and Initialization: A Convergence Analysis of Layer-wise Training for Deep Linear Neural Networks
  Yeonjong Shin · 14 Oct 2019
• Nearly Minimal Over-Parametrization of Shallow Neural Networks
  Armin Eftekhari, Chaehwan Song, Volkan Cevher · 09 Oct 2019
• Algorithm-Dependent Generalization Bounds for Overparameterized Deep Residual Networks
  Spencer Frei, Yuan Cao, Quanquan Gu · ODL · 07 Oct 2019
• Harnessing the Power of Infinitely Wide Deep Nets on Small-data Tasks
  Sanjeev Arora, S. Du, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang, Dingli Yu · AAML · 03 Oct 2019
• Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks
  Yu Bai, Jason D. Lee · 03 Oct 2019
• Distillation ≈ Early Stopping? Harvesting Dark Knowledge Utilizing Anisotropic Information Retrieval For Overparameterized Neural Network
  Bin Dong, Jikai Hou, Yiping Lu, Zhihua Zhang · 02 Oct 2019
• The asymptotic spectrum of the Hessian of DNN throughout training
  Arthur Jacot, Franck Gabriel, Clément Hongler · 01 Oct 2019
• On the convergence of gradient descent for two layer neural networks
  Lei Li · MLT · 30 Sep 2019
• On the Anomalous Generalization of GANs
  Jinchen Xuan, Yunchang Yang, Ze Yang, Di He, Liwei Wang · 27 Sep 2019