arXiv:1810.02054
Gradient Descent Provably Optimizes Over-parameterized Neural Networks
4 October 2018
S. Du
Xiyu Zhai
Barnabás Póczós
Aarti Singh
Communities: MLT, ODL
Papers citing "Gradient Descent Provably Optimizes Over-parameterized Neural Networks" (50 of 882 papers shown)
Extreme Memorization via Scale of Initialization (Harsh Mehta, Ashok Cutkosky, Behnam Neyshabur; 31 Aug 2020)
Predicting Training Time Without Training (Luca Zancato, Alessandro Achille, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto; 28 Aug 2020)
Deep Networks and the Multiple Manifold Problem (Sam Buchanan, D. Gilboa, John N. Wright; 25 Aug 2020)
Asymptotics of Wide Convolutional Neural Networks (Anders Andreassen, Ethan Dyer; 19 Aug 2020)
The Neural Tangent Kernel in High Dimensions: Triple Descent and a Multi-Scale Theory of Generalization (Ben Adlam, Jeffrey Pennington; 15 Aug 2020)
On the Generalization Properties of Adversarial Training (Yue Xing, Qifan Song, Guang Cheng; AAML; 15 Aug 2020)
Adversarial Training and Provable Robustness: A Tale of Two Objectives (Jiameng Fan, Wenchao Li; AAML; 13 Aug 2020)
Implicit Regularization via Neural Feature Alignment (A. Baratin, Thomas George, César Laurent, R. Devon Hjelm, Guillaume Lajoie, Pascal Vincent, Simon Lacoste-Julien; 03 Aug 2020)
Finite Versus Infinite Neural Networks: an Empirical Study (Jaehoon Lee, S. Schoenholz, Jeffrey Pennington, Ben Adlam, Lechao Xiao, Roman Novak, Jascha Narain Sohl-Dickstein; 31 Jul 2020)
On the Banach spaces associated with multi-layer ReLU networks: Function representation, approximation theory and gradient descent dynamics (E. Weinan, Stephan Wojtowytsch; MLT; 30 Jul 2020)
The Interpolation Phase Transition in Neural Networks: Memorization and Generalization under Lazy Training (Andrea Montanari, Yiqiao Zhong; 25 Jul 2020)
Geometric compression of invariant manifolds in neural nets (J. Paccolat, Leonardo Petrini, Mario Geiger, Kevin Tyloo, Matthieu Wyart; MLT; 22 Jul 2020)
Early Stopping in Deep Networks: Double Descent and How to Eliminate it (Reinhard Heckel, Fatih Yilmaz; 20 Jul 2020)
Understanding Implicit Regularization in Over-Parameterized Single Index Model (Jianqing Fan, Zhuoran Yang, Mengxin Yu; 16 Jul 2020)
Plateau Phenomenon in Gradient Descent Training of ReLU networks: Explanation, Quantification and Avoidance (M. Ainsworth, Yeonjong Shin; ODL; 14 Jul 2020)
Implicit Bias in Deep Linear Classification: Initialization Scale vs Training Accuracy (E. Moroshko, Suriya Gunasekar, Blake E. Woodworth, Jason D. Lee, Nathan Srebro, Daniel Soudry; 13 Jul 2020)
Generalization bound of globally optimal non-convex neural network training: Transportation map estimation by infinite dimensional Langevin dynamics (Taiji Suzuki; 11 Jul 2020)
Maximum-and-Concatenation Networks (Xingyu Xie, Hao Kong, Jianlong Wu, Wayne Zhang, Guangcan Liu, Zhouchen Lin; 09 Jul 2020)
Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK (Yuanzhi Li, Tengyu Ma, Hongyang R. Zhang; MLT; 09 Jul 2020)
Towards an Understanding of Residual Networks Using Neural Tangent Hierarchy (NTH) (Yuqing Li, Yaoyu Zhang, N. Yip; 07 Jul 2020)
Bespoke vs. Prêt-à-Porter Lottery Tickets: Exploiting Mask Similarity for Trainable Sub-Network Finding (Michela Paganini, Jessica Zosa Forde; UQCV; 06 Jul 2020)
Regularization Matters: A Nonparametric Perspective on Overparametrized Neural Network (Tianyang Hu, Wei Cao, Cong Lin, Guang Cheng; 06 Jul 2020)
Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks (Cong Fang, Jason D. Lee, Pengkun Yang, Tong Zhang; OOD, FedML; 03 Jul 2020)
A Revision of Neural Tangent Kernel-based Approaches for Neural Networks (Kyungsu Kim, A. Lozano, Eunho Yang; AAML; 02 Jul 2020)
Go Wide, Then Narrow: Efficient Training of Deep Thin Networks (Denny Zhou, Mao Ye, Chen Chen, Tianjian Meng, Mingxing Tan, Xiaodan Song, Quoc V. Le, Qiang Liu, Dale Schuurmans; 01 Jul 2020)
Statistical Mechanical Analysis of Neural Network Pruning (Rupam Acharyya, Ankani Chattoraj, Boyu Zhang, Shouman Das, Daniel Stefankovic; 30 Jun 2020)
Theory-Inspired Path-Regularized Differential Network Architecture Search (Pan Zhou, Caiming Xiong, R. Socher, Guosheng Lin; 30 Jun 2020)
Two-Layer Neural Networks for Partial Differential Equations: Optimization and Generalization Theory (Yaoyu Zhang, Haizhao Yang; 28 Jun 2020)
Offline Contextual Bandits with Overparameterized Models (David Brandfonbrener, William F. Whitney, Rajesh Ranganath, Joan Bruna; OffRL; 27 Jun 2020)
Tensor Programs II: Neural Tangent Kernel for Any Architecture (Greg Yang; 25 Jun 2020)
The Quenching-Activation Behavior of the Gradient Descent Dynamics for Two-layer Neural Network Models (Chao Ma, Lei Wu, E. Weinan; MLT; 25 Jun 2020)
Towards Understanding Hierarchical Learning: Benefits of Neural Representations (Minshuo Chen, Yu Bai, Jason D. Lee, T. Zhao, Huan Wang, Caiming Xiong, R. Socher; SSL; 24 Jun 2020)
When Do Neural Networks Outperform Kernel Methods? (Behrooz Ghorbani, Song Mei, Theodor Misiakiewicz, Andrea Montanari; 24 Jun 2020)
On the Global Optimality of Model-Agnostic Meta-Learning (Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang; 23 Jun 2020)
Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime (Atsushi Nitanda, Taiji Suzuki; 22 Jun 2020)
Generalisation Guarantees for Continual Learning with Orthogonal Gradient Descent (Mehdi Abbana Bennani, Thang Doan, Masashi Sugiyama; CLL; 21 Jun 2020)
Training (Overparametrized) Neural Networks in Near-Linear Time (Jan van den Brand, Binghui Peng, Zhao Song, Omri Weinstein; ODL; 20 Jun 2020)
An analytic theory of shallow networks dynamics for hinge loss classification (Franco Pellegrini, Giulio Biroli; 19 Jun 2020)
Exploring Weight Importance and Hessian Bias in Model Pruning (Mingchen Li, Yahya Sattar, Christos Thrampoulidis, Samet Oymak; 19 Jun 2020)
Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains (Matthew Tancik, Pratul P. Srinivasan, B. Mildenhall, Sara Fridovich-Keil, N. Raghavan, Utkarsh Singhal, R. Ramamoorthi, Jonathan T. Barron, Ren Ng; 18 Jun 2020)
Revisiting minimum description length complexity in overparameterized models (Raaz Dwivedi, Chandan Singh, Bin Yu, Martin J. Wainwright; 17 Jun 2020)
Kernel Alignment Risk Estimator: Risk Prediction from Training Data (Arthur Jacot, Berfin Şimşek, Francesco Spadaro, Clément Hongler, Franck Gabriel; 17 Jun 2020)
Directional Pruning of Deep Neural Networks (Shih-Kang Chao, Zhanyu Wang, Yue Xing, Guang Cheng; ODL; 16 Jun 2020)
Optimization and Generalization Analysis of Transduction through Gradient Boosting and Application to Multi-scale Graph Neural Networks (Kenta Oono, Taiji Suzuki; AI4CE; 15 Jun 2020)
Global Convergence of Sobolev Training for Overparameterized Neural Networks (Jorio Cocola, Paul Hand; 14 Jun 2020)
Minimax Estimation of Conditional Moment Models (Nishanth Dikkala, Greg Lewis, Lester W. Mackey, Vasilis Syrgkanis; 12 Jun 2020)
Non-convergence of stochastic gradient descent in the training of deep neural networks (Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek; 12 Jun 2020)
Optimization Theory for ReLU Neural Networks Trained with Normalization Layers (Yonatan Dukler, Quanquan Gu, Guido Montúfar; 11 Jun 2020)
Tangent Space Sensitivity and Distribution of Linear Regions in ReLU Networks (Balint Daroczy; AAML; 11 Jun 2020)
Asymptotics of Ridge(less) Regression under General Source Condition (Dominic Richards, Jaouad Mourtada, Lorenzo Rosasco; 11 Jun 2020)