arXiv: 1810.02054
Gradient Descent Provably Optimizes Over-parameterized Neural Networks
4 October 2018
S. Du
Xiyu Zhai
Barnabás Póczós
Aarti Singh
MLT
ODL
Papers citing
"Gradient Descent Provably Optimizes Over-parameterized Neural Networks"
50 / 882 papers shown
Finite Sample Identification of Wide Shallow Neural Networks with Biases
M. Fornasier
T. Klock
Marco Mondelli
Michael Rauchensteiner
54
6
0
08 Nov 2022
A Functional-Space Mean-Field Theory of Partially-Trained Three-Layer Neural Networks
Zhengdao Chen
Eric Vanden-Eijnden
Joan Bruna
MLT
77
5
0
28 Oct 2022
LOFT: Finding Lottery Tickets through Filter-wise Training
Qihan Wang
Chen Dun
Fangshuo Liao
C. Jermaine
Anastasios Kyrillidis
69
3
0
28 Oct 2022
Sparsity in Continuous-Depth Neural Networks
H. Aliee
Till Richter
Mikhail Solonin
I. Ibarra
Fabian J. Theis
Niki Kilbertus
97
11
0
26 Oct 2022
Pushing the Efficiency Limit Using Structured Sparse Convolutions
Vinay Kumar Verma
Nikhil Mehta
Shijing Si
Ricardo Henao
Lawrence Carin
50
3
0
23 Oct 2022
Global Convergence of SGD On Two Layer Neural Nets
Pulkit Gopalani
Anirbit Mukherjee
64
6
0
20 Oct 2022
Theoretical Guarantees for Permutation-Equivariant Quantum Neural Networks
Louis Schatzki
Martín Larocca
Quynh T. Nguyen
F. Sauvage
M. Cerezo
109
92
0
18 Oct 2022
Implicit Bias in Leaky ReLU Networks Trained on High-Dimensional Data
Spencer Frei
Gal Vardi
Peter L. Bartlett
Nathan Srebro
Wei Hu
MLT
83
42
0
13 Oct 2022
Mean-field analysis for heavy ball methods: Dropout-stability, connectivity, and global convergence
Diyuan Wu
Vyacheslav Kungurtsev
Marco Mondelli
60
3
0
13 Oct 2022
From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent
Satyen Kale
Jason D. Lee
Chris De Sa
Ayush Sekhari
Karthik Sridharan
49
4
0
13 Oct 2022
Few-shot Backdoor Attacks via Neural Tangent Kernels
J. Hayase
Sewoong Oh
74
21
0
12 Oct 2022
Toward Sustainable Continual Learning: Detection and Knowledge Repurposing of Similar Tasks
Sijia Wang
Yoojin Choi
Junya Chen
Mostafa El-Khamy
Ricardo Henao
CLL
64
0
0
11 Oct 2022
A Kernel-Based View of Language Model Fine-Tuning
Sadhika Malladi
Alexander Wettig
Dingli Yu
Danqi Chen
Sanjeev Arora
VLM
157
69
0
11 Oct 2022
What Can the Neural Tangent Kernel Tell Us About Adversarial Robustness?
Nikolaos Tsilivis
Julia Kempe
AAML
98
20
0
11 Oct 2022
Efficient NTK using Dimensionality Reduction
Nir Ailon
Supratim Shit
95
0
0
10 Oct 2022
On skip connections and normalisation layers in deep optimisation
L. MacDonald
Jack Valmadre
Hemanth Saratchandran
Simon Lucey
ODL
74
2
0
10 Oct 2022
Dynamical Isometry for Residual Networks
Advait Gadhikar
R. Burkholz
ODL
AI4CE
81
2
0
05 Oct 2022
Structural Estimation of Markov Decision Processes in High-Dimensional State Space with Finite-Time Guarantees
Siliang Zeng
Mingyi Hong
Alfredo García
OffRL
83
12
0
04 Oct 2022
Plateau in Monotonic Linear Interpolation -- A "Biased" View of Loss Landscape for Deep Networks
Xiang Wang
Annie Wang
Mo Zhou
Rong Ge
MoMe
231
10
0
03 Oct 2022
A Combinatorial Perspective on the Optimization of Shallow ReLU Networks
Michael Matena
Colin Raffel
38
1
0
01 Oct 2022
On the optimization and generalization of overparameterized implicit neural networks
Tianxiang Gao
Hongyang Gao
MLT
AI4CE
65
3
0
30 Sep 2022
Neural Networks Efficiently Learn Low-Dimensional Representations with SGD
Alireza Mousavi-Hosseini
Sejun Park
M. Girotti
Ioannis Mitliagkas
Murat A. Erdogdu
MLT
379
50
0
29 Sep 2022
Stability and Generalization Analysis of Gradient Methods for Shallow Neural Networks
Yunwen Lei
Rong Jin
Yiming Ying
MLT
100
19
0
19 Sep 2022
Lazy vs hasty: linearization in deep networks impacts learning schedule based on example difficulty
Thomas George
Guillaume Lajoie
A. Baratin
87
6
0
19 Sep 2022
Approximation results for Gradient Descent trained Shallow Neural Networks in $1d$
R. Gentile
G. Welper
ODL
104
7
0
17 Sep 2022
Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a Polynomial Net Study
Yongtao Wu
Zhenyu Zhu
Fanghui Liu
Grigorios G. Chrysos
Volkan Cevher
85
11
0
16 Sep 2022
Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization)
Zhenyu Zhu
Fanghui Liu
Grigorios G. Chrysos
Volkan Cevher
104
21
0
15 Sep 2022
Generalization Properties of NAS under Activation and Skip Connection Search
Zhenyu Zhu
Fanghui Liu
Grigorios G. Chrysos
Volkan Cevher
AI4CE
100
17
0
15 Sep 2022
On the Trade-Off between Actionable Explanations and the Right to be Forgotten
Martin Pawelczyk
Tobias Leemann
Asia J. Biega
Gjergji Kasneci
FaML
MU
109
23
0
30 Aug 2022
Neural Tangent Kernel: A Survey
Eugene Golikov
Eduard Pokonechnyy
Vladimir Korviakov
76
14
0
29 Aug 2022
Overparameterization from Computational Constraints
Sanjam Garg
S. Jha
Saeed Mahloujifar
Mohammad Mahmoody
Mingyuan Wang
47
2
0
27 Aug 2022
Universal Solutions of Feedforward ReLU Networks for Interpolations
Changcun Huang
63
2
0
16 Aug 2022
Gaussian Process Surrogate Models for Neural Networks
Michael Y. Li
Erin Grant
Thomas Griffiths
BDL
SyDa
109
8
0
11 Aug 2022
A Sublinear Adversarial Training Algorithm
Yeqi Gao
Lianke Qin
Zhao Song
Yitan Wang
GAN
77
25
0
10 Aug 2022
Training Overparametrized Neural Networks in Sublinear Time
Yichuan Deng
Han Hu
Zhao Song
Omri Weinstein
Danyang Zhuo
BDL
99
28
0
09 Aug 2022
Neural Set Function Extensions: Learning with Discrete Functions in High Dimensions
Nikolaos Karalias
Joshua Robinson
Andreas Loukas
Stefanie Jegelka
128
9
0
08 Aug 2022
Provable Acceleration of Nesterov's Accelerated Gradient Method over Heavy Ball Method in Training Over-Parameterized Neural Networks
Xin Liu
Wei Tao
Wei Li
Dazhi Zhan
Jun Wang
Zhisong Pan
ODL
84
1
0
08 Aug 2022
On Fast Simulation of Dynamical System with Neural Vector Enhanced Numerical Solver
Zhongzhan Huang
Senwei Liang
Hong Zhang
Haizhao Yang
Liang Lin
AI4CE
101
9
0
07 Aug 2022
Federated Adversarial Learning: A Framework with Convergence Analysis
Xiaoxiao Li
Zhao Song
Jiaming Yang
FedML
92
21
0
07 Aug 2022
Towards Understanding Mixture of Experts in Deep Learning
Zixiang Chen
Yihe Deng
Yue-bo Wu
Quanquan Gu
Yuan-Fang Li
MLT
MoE
97
60
0
04 Aug 2022
Agnostic Learning of General ReLU Activation Using Gradient Descent
Pranjal Awasthi
Alex K. Tang
Aravindan Vijayaraghavan
MLT
64
7
0
04 Aug 2022
Gradient descent provably escapes saddle points in the training of shallow ReLU networks
Patrick Cheridito
Arnulf Jentzen
Florian Rossmannek
108
5
0
03 Aug 2022
Analyzing Sharpness along GD Trajectory: Progressive Sharpening and Edge of Stability
Z. Li
Zixuan Wang
Jian Li
97
47
0
26 Jul 2022
Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit
Boaz Barak
Benjamin L. Edelman
Surbhi Goel
Sham Kakade
Eran Malach
Cyril Zhang
114
133
0
18 Jul 2022
The Lottery Ticket Hypothesis for Self-attention in Convolutional Neural Network
Zhongzhan Huang
Senwei Liang
Mingfu Liang
Wei He
Haizhao Yang
Liang Lin
72
9
0
16 Jul 2022
Riemannian Natural Gradient Methods
Jiang Hu
Ruicheng Ao
Anthony Man-Cho So
Minghan Yang
Zaiwen Wen
67
11
0
15 Jul 2022
Efficient Augmentation for Imbalanced Deep Learning
Damien Dablain
C. Bellinger
Bartosz Krawczyk
Nitesh Chawla
66
7
0
13 Jul 2022
Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent
Zhiyuan Li
Tianhao Wang
Jason D. Lee
Sanjeev Arora
110
29
0
08 Jul 2022
Learning and generalization of one-hidden-layer neural networks, going beyond standard Gaussian data
Hongkang Li
Shuai Zhang
Ming Wang
MLT
75
8
0
07 Jul 2022
Neural Stein critics with staged $L^2$-regularization
Matthew Repasky
Xiuyuan Cheng
Yao Xie
61
3
0
07 Jul 2022