ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

arXiv:1810.02054 (v2, latest)
Gradient Descent Provably Optimizes Over-parameterized Neural Networks

4 October 2018
S. Du
Xiyu Zhai
Barnabás Póczós
Aarti Singh
    MLT, ODL
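As an illustration of the paper's setting (a sketch, not the authors' code), the following minimal numpy experiment trains a two-layer ReLU network with far more hidden units than training points by full-batch gradient descent on the first-layer weights, with the second-layer signs held fixed as in the paper's analysis. The sizes, learning rate, step count, and seed are arbitrary choices for the demonstration; the point is that the squared training loss is driven to near zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 5, 5, 500          # few samples, many hidden units (over-parameterized)
X = rng.normal(size=(n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # unit-norm inputs, as in NTK-style analyses
y = rng.normal(size=n)

W = rng.normal(size=(m, d))           # trainable first layer
a = rng.choice([-1.0, 1.0], size=m)   # fixed second-layer signs

def forward(W):
    H = np.maximum(X @ W.T, 0.0)      # ReLU activations, shape (n, m)
    return H @ a / np.sqrt(m)         # f(x_i) = (1/sqrt(m)) * sum_r a_r relu(w_r . x_i)

lr = 0.5
for step in range(3000):
    err = forward(W) - y                      # residuals, shape (n,)
    mask = (X @ W.T > 0).astype(float)        # ReLU derivative, shape (n, m)
    # dL/dw_r = (1/sqrt(m)) * a_r * sum_i err_i * 1[w_r . x_i > 0] * x_i
    grad = ((err[:, None] * mask) * a).T @ X / np.sqrt(m)
    W -= lr * grad

loss = 0.5 * np.sum((forward(W) - y) ** 2)
print(f"final training loss: {loss:.2e}")
```

With the width m large relative to n, the hidden-layer activation pattern barely changes during training, which is the lazy/NTK regime in which the paper proves linear convergence of gradient descent to zero training loss.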

Papers citing "Gradient Descent Provably Optimizes Over-parameterized Neural Networks"

50 / 882 papers shown
• Extreme Memorization via Scale of Initialization. Harsh Mehta, Ashok Cutkosky, Behnam Neyshabur. 31 Aug 2020.
• Predicting Training Time Without Training. Luca Zancato, Alessandro Achille, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto. 28 Aug 2020.
• Deep Networks and the Multiple Manifold Problem. Sam Buchanan, D. Gilboa, John N. Wright. 25 Aug 2020.
• Asymptotics of Wide Convolutional Neural Networks. Anders Andreassen, Ethan Dyer. 19 Aug 2020.
• The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization. Ben Adlam, Jeffrey Pennington. 15 Aug 2020.
• On the Generalization Properties of Adversarial Training. Yue Xing, Qifan Song, Guang Cheng. 15 Aug 2020. [AAML]
• Adversarial Training and Provable Robustness: A Tale of Two Objectives. Jiameng Fan, Wenchao Li. 13 Aug 2020. [AAML]
• Implicit Regularization via Neural Feature Alignment. A. Baratin, Thomas George, César Laurent, R. Devon Hjelm, Guillaume Lajoie, Pascal Vincent, Simon Lacoste-Julien. 03 Aug 2020.
• Finite Versus Infinite Neural Networks: an Empirical Study. Jaehoon Lee, S. Schoenholz, Jeffrey Pennington, Ben Adlam, Lechao Xiao, Roman Novak, Jascha Narain Sohl-Dickstein. 31 Jul 2020.
• On the Banach spaces associated with multi-layer ReLU networks: Function representation, approximation theory and gradient descent dynamics. E. Weinan, Stephan Wojtowytsch. 30 Jul 2020. [MLT]
• The Interpolation Phase Transition in Neural Networks: Memorization and Generalization under Lazy Training. Andrea Montanari, Yiqiao Zhong. 25 Jul 2020.
• Geometric compression of invariant manifolds in neural nets. J. Paccolat, Leonardo Petrini, Mario Geiger, Kevin Tyloo, Matthieu Wyart. 22 Jul 2020. [MLT]
• Early Stopping in Deep Networks: Double Descent and How to Eliminate it. Reinhard Heckel, Fatih Yilmaz. 20 Jul 2020.
• Understanding Implicit Regularization in Over-Parameterized Single Index Model. Jianqing Fan, Zhuoran Yang, Mengxin Yu. 16 Jul 2020.
• Plateau Phenomenon in Gradient Descent Training of ReLU networks: Explanation, Quantification and Avoidance. M. Ainsworth, Yeonjong Shin. 14 Jul 2020. [ODL]
• Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy. E. Moroshko, Suriya Gunasekar, Blake E. Woodworth, Jason D. Lee, Nathan Srebro, Daniel Soudry. 13 Jul 2020.
• Generalization bound of globally optimal non-convex neural network training: Transportation map estimation by infinite dimensional Langevin dynamics. Taiji Suzuki. 11 Jul 2020.
• Maximum-and-Concatenation Networks. Xingyu Xie, Hao Kong, Jianlong Wu, Wayne Zhang, Guangcan Liu, Zhouchen Lin. 09 Jul 2020.
• Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK. Yuanzhi Li, Tengyu Ma, Hongyang R. Zhang. 09 Jul 2020. [MLT]
• Towards an Understanding of Residual Networks Using Neural Tangent Hierarchy (NTH). Yuqing Li, Yaoyu Zhang, N. Yip. 07 Jul 2020.
• Bespoke vs. Prêt-à-Porter Lottery Tickets: Exploiting Mask Similarity for Trainable Sub-Network Finding. Michela Paganini, Jessica Zosa Forde. 06 Jul 2020. [UQCV]
• Regularization Matters: A Nonparametric Perspective on Overparametrized Neural Network. Tianyang Hu, Wei Cao, Cong Lin, Guang Cheng. 06 Jul 2020.
• Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks. Cong Fang, Jason D. Lee, Pengkun Yang, Tong Zhang. 03 Jul 2020. [OOD, FedML]
• A Revision of Neural Tangent Kernel-based Approaches for Neural Networks. Kyungsu Kim, A. Lozano, Eunho Yang. 02 Jul 2020. [AAML]
• Go Wide, Then Narrow: Efficient Training of Deep Thin Networks. Denny Zhou, Mao Ye, Chen Chen, Tianjian Meng, Mingxing Tan, Xiaodan Song, Quoc V. Le, Qiang Liu, Dale Schuurmans. 01 Jul 2020.
• Statistical Mechanical Analysis of Neural Network Pruning. Rupam Acharyya, Ankani Chattoraj, Boyu Zhang, Shouman Das, Daniel Stefankovic. 30 Jun 2020.
• Theory-Inspired Path-Regularized Differential Network Architecture Search. Pan Zhou, Caiming Xiong, R. Socher, Guosheng Lin. 30 Jun 2020.
• Two-Layer Neural Networks for Partial Differential Equations: Optimization and Generalization Theory. Yaoyu Zhang, Haizhao Yang. 28 Jun 2020.
• Offline Contextual Bandits with Overparameterized Models. David Brandfonbrener, William F. Whitney, Rajesh Ranganath, Joan Bruna. 27 Jun 2020. [OffRL]
• Tensor Programs II: Neural Tangent Kernel for Any Architecture. Greg Yang. 25 Jun 2020.
• The Quenching-Activation Behavior of the Gradient Descent Dynamics for Two-layer Neural Network Models. Chao Ma, Lei Wu, E. Weinan. 25 Jun 2020. [MLT]
• Towards Understanding Hierarchical Learning: Benefits of Neural Representations. Minshuo Chen, Yu Bai, Jason D. Lee, T. Zhao, Huan Wang, Caiming Xiong, R. Socher. 24 Jun 2020. [SSL]
• When Do Neural Networks Outperform Kernel Methods? Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari. 24 Jun 2020.
• On the Global Optimality of Model-Agnostic Meta-Learning. Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang. 23 Jun 2020.
• Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime. Atsushi Nitanda, Taiji Suzuki. 22 Jun 2020.
• Generalisation Guarantees for Continual Learning with Orthogonal Gradient Descent. Mehdi Abbana Bennani, Thang Doan, Masashi Sugiyama. 21 Jun 2020. [CLL]
• Training (Overparametrized) Neural Networks in Near-Linear Time. Jan van den Brand, Binghui Peng, Zhao Song, Omri Weinstein. 20 Jun 2020. [ODL]
• An analytic theory of shallow networks dynamics for hinge loss classification. Franco Pellegrini, Giulio Biroli. 19 Jun 2020.
• Exploring Weight Importance and Hessian Bias in Model Pruning. Mingchen Li, Yahya Sattar, Christos Thrampoulidis, Samet Oymak. 19 Jun 2020.
• Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains. Matthew Tancik, Pratul P. Srinivasan, B. Mildenhall, Sara Fridovich-Keil, N. Raghavan, Utkarsh Singhal, R. Ramamoorthi, Jonathan T. Barron, Ren Ng. 18 Jun 2020.
• Revisiting minimum description length complexity in overparameterized models. Raaz Dwivedi, Chandan Singh, Bin Yu, Martin J. Wainwright. 17 Jun 2020.
• Kernel Alignment Risk Estimator: Risk Prediction from Training Data. Arthur Jacot, Berfin Şimşek, Francesco Spadaro, Clément Hongler, Franck Gabriel. 17 Jun 2020.
• Directional Pruning of Deep Neural Networks. Shih-Kang Chao, Zhanyu Wang, Yue Xing, Guang Cheng. 16 Jun 2020. [ODL]
• Optimization and Generalization Analysis of Transduction through Gradient Boosting and Application to Multi-scale Graph Neural Networks. Kenta Oono, Taiji Suzuki. 15 Jun 2020. [AI4CE]
• Global Convergence of Sobolev Training for Overparameterized Neural Networks. Jorio Cocola, Paul Hand. 14 Jun 2020.
• Minimax Estimation of Conditional Moment Models. Nishanth Dikkala, Greg Lewis, Lester W. Mackey, Vasilis Syrgkanis. 12 Jun 2020.
• Non-convergence of stochastic gradient descent in the training of deep neural networks. Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek. 12 Jun 2020.
• Optimization Theory for ReLU Neural Networks Trained with Normalization Layers. Yonatan Dukler, Quanquan Gu, Guido Montúfar. 11 Jun 2020.
• Tangent Space Sensitivity and Distribution of Linear Regions in ReLU Networks. Balint Daroczy. 11 Jun 2020. [AAML]
• Asymptotics of Ridge(less) Regression under General Source Condition. Dominic Richards, Jaouad Mourtada, Lorenzo Rosasco. 11 Jun 2020.