arXiv: 1710.10174
SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data
27 October 2017
Alon Brutzkus, Amir Globerson, Eran Malach, Shai Shalev-Shwartz
Community: MLT
Papers citing "SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data" (50 of 60 papers shown)
| Title | Authors | Communities | Citations | Date |
|---|---|---|---|---|
| Minimax Optimal Convergence of Gradient Descent in Logistic Regression via Large and Adaptive Stepsizes | Ruiqi Zhang, Jingfeng Wu, Licong Lin, Peter L. Bartlett | | 0 | 05 Apr 2025 |
| SCoTTi: Save Computation at Training Time with an adaptive framework | Ziyu Li, Enzo Tartaglione, Van-Tam Nguyen | | 0 | 19 Dec 2023 |
| Polynomially Over-Parameterized Convolutional Neural Networks Contain Structured Strong Winning Lottery Tickets | A. D. Cunha, Francesco d’Amore, Emanuele Natale | MLT | 1 | 16 Nov 2023 |
| Early Neuron Alignment in Two-layer ReLU Networks with Small Initialization | Hancheng Min, Enrique Mallada, René Vidal | MLT | 18 | 24 Jul 2023 |
| Generalization Guarantees of Gradient Descent for Multi-Layer Neural Networks | Puyu Wang, Yunwen Lei, Di Wang, Yiming Ying, Ding-Xuan Zhou | MLT | 3 | 26 May 2023 |
| Dynamic Sparse Training via Balancing the Exploration-Exploitation Trade-off | Shaoyi Huang, Bowen Lei, Dongkuan Xu, Hongwu Peng, Yue Sun, Mimi Xie, Caiwen Ding | | 19 | 30 Nov 2022 |
| Do highly over-parameterized neural networks generalize since bad solutions are rare? | Julius Martinetz, T. Martinetz | | 1 | 07 Nov 2022 |
| Sparsity in Continuous-Depth Neural Networks | H. Aliee, Till Richter, Mikhail Solonin, I. Ibarra, Fabian J. Theis, Niki Kilbertus | | 10 | 26 Oct 2022 |
| Theoretical Guarantees for Permutation-Equivariant Quantum Neural Networks | Louis Schatzki, Martín Larocca, Quynh T. Nguyen, F. Sauvage, M. Cerezo | | 84 | 18 Oct 2022 |
| Annihilation of Spurious Minima in Two-Layer ReLU Networks | Yossi Arjevani, M. Field | | 8 | 12 Oct 2022 |
| Implicit Full Waveform Inversion with Deep Neural Representation | Jian Sun, K. Innanen | AI4CE | 32 | 08 Sep 2022 |
| On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms | Lam M. Nguyen, Trang H. Tran | | 2 | 13 Jun 2022 |
| Deep Layer-wise Networks Have Closed-Form Weights | Chieh-Tsai Wu, A. Masoomi, A. Gretton, Jennifer Dy | | 3 | 01 Feb 2022 |
| Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks | Bartlomiej Polaczyk, J. Cyranka | ODL | 3 | 28 Jan 2022 |
| How does unlabeled data improve generalization in self-training? A one-hidden-layer theoretical analysis | Shuai Zhang, Hao Wu, Sijia Liu, Pin-Yu Chen, Jinjun Xiong | SSL, MLT | 22 | 21 Jan 2022 |
| Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape | Devansh Bisla, Jing Wang, A. Choromańska | | 34 | 20 Jan 2022 |
| Regularization by Misclassification in ReLU Neural Networks | Elisabetta Cornacchia, Jan Hązła, Ido Nachum, Amir Yehudayoff | NoLa | 2 | 03 Nov 2021 |
| Path Regularization: A Convexity and Sparsity Inducing Regularization for Parallel ReLU Networks | Tolga Ergen, Mert Pilanci | | 16 | 18 Oct 2021 |
| Global Optimality Beyond Two Layers: Training Deep ReLU Networks via Convex Programs | Tolga Ergen, Mert Pilanci | OffRL, MLT | 32 | 11 Oct 2021 |
| Theory of overparametrization in quantum neural networks | Martín Larocca, Nathan Ju, Diego García-Martín, Patrick J. Coles, M. Cerezo | | 188 | 23 Sep 2021 |
| Proxy Convexity: A Unified Framework for the Analysis of Neural Networks Trained by Gradient Descent | Spencer Frei, Quanquan Gu | | 25 | 25 Jun 2021 |
| The Implicit Bias for Adaptive Optimization Algorithms on Homogeneous Neural Networks | Bohan Wang, Qi Meng, Wei Chen, Tie-Yan Liu | | 33 | 11 Dec 2020 |
| Gradient Starvation: A Learning Proclivity in Neural Networks | Mohammad Pezeshki, Sekouba Kaba, Yoshua Bengio, Aaron Courville, Doina Precup, Guillaume Lajoie | MLT | 257 | 18 Nov 2020 |
| LOss-Based SensiTivity rEgulaRization: towards deep sparse neural networks | Enzo Tartaglione, Andrea Bragagnolo, A. Fiandrotti, Marco Grangetto | ODL, UQCV | 34 | 16 Nov 2020 |
| Deep Learning is Singular, and That's Good | Daniel Murfet, Susan Wei, Biwei Huang, Hui Li, Jesse Gell-Redman, T. Quella | UQCV | 26 | 22 Oct 2020 |
| Predicting Training Time Without Training | L. Zancato, Alessandro Achille, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto | | 24 | 28 Aug 2020 |
| Neural Anisotropy Directions | Guillermo Ortiz-Jiménez, Apostolos Modas, Seyed-Mohsen Moosavi-Dezfooli, P. Frossard | | 16 | 17 Jun 2020 |
| Non-convergence of stochastic gradient descent in the training of deep neural networks | Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek | | 37 | 12 Jun 2020 |
| Feature Purification: How Adversarial Training Performs Robust Deep Learning | Zeyuan Allen-Zhu, Yuanzhi Li | MLT, AAML | 147 | 20 May 2020 |
| Symmetry & critical points for a model shallow neural network | Yossi Arjevani, M. Field | | 13 | 23 Mar 2020 |
| Convex Geometry and Duality of Over-parameterized Neural Networks | Tolga Ergen, Mert Pilanci | MLT | 54 | 25 Feb 2020 |
| An Optimization and Generalization Analysis for Max-Pooling Networks | Alon Brutzkus, Amir Globerson | MLT, AI4CE | 4 | 22 Feb 2020 |
| Learning Parities with Neural Networks | Amit Daniely, Eran Malach | | 76 | 18 Feb 2020 |
| Revisiting Landscape Analysis in Deep Neural Networks: Eliminating Decreasing Paths to Infinity | Shiyu Liang, Ruoyu Sun, R. Srikant | | 19 | 31 Dec 2019 |
| Optimization for deep learning: theory and algorithms | Ruoyu Sun | ODL | 168 | 19 Dec 2019 |
| How does topology influence gradient propagation and model performance of deep networks with DenseNet-type skip connections? | Kartikeya Bhardwaj, Guihong Li, R. Marculescu | | 1 | 02 Oct 2019 |
| Neural ODEs as the Deep Limit of ResNets with constant weights | B. Avelin, K. Nystrom | ODL | 31 | 28 Jun 2019 |
| On the Noisy Gradient Descent that Generalizes as SGD | Jingfeng Wu, Wenqing Hu, Haoyi Xiong, Jun Huan, Vladimir Braverman, Zhanxing Zhu | MLT | 10 | 18 Jun 2019 |
| Gradient Descent can Learn Less Over-parameterized Two-layer Neural Networks on Classification Problems | Atsushi Nitanda, Geoffrey Chinot, Taiji Suzuki | MLT | 33 | 23 May 2019 |
| Data-dependent Sample Complexity of Deep Neural Networks via Lipschitz Augmentation | Colin Wei, Tengyu Ma | | 109 | 09 May 2019 |
| Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks | Mingchen Li, Mahdi Soltanolkotabi, Samet Oymak | NoLa | 351 | 27 Mar 2019 |
| Is Deeper Better only when Shallow is Good? | Eran Malach, Shai Shalev-Shwartz | | 45 | 08 Mar 2019 |
| A Priori Estimates of the Population Risk for Residual Networks | E. Weinan, Chao Ma, Qingcan Wang | UQCV | 61 | 06 Mar 2019 |
| Copying Machine Learning Classifiers | Irene Unceta, Jordi Nin, O. Pujol | | 18 | 05 Mar 2019 |
| Parameter Efficient Training of Deep Convolutional Neural Networks by Dynamic Sparse Reparameterization | Hesham Mostafa, Xin Wang | | 307 | 15 Feb 2019 |
| On a Sparse Shortcut Topology of Artificial Neural Networks | Fenglei Fan, Dayang Wang, Hengtao Guo, Qikui Zhu, Pingkun Yan, Ge Wang, Hengyong Yu | | 21 | 22 Nov 2018 |
| Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks | Difan Zou, Yuan Cao, Dongruo Zhou, Quanquan Gu | ODL | 446 | 21 Nov 2018 |
| On the Convergence Rate of Training Recurrent Neural Networks | Zeyuan Allen-Zhu, Yuanzhi Li, Zhao-quan Song | | 191 | 29 Oct 2018 |
| Subgradient Descent Learns Orthogonal Dictionaries | Yu Bai, Qijia Jiang, Ju Sun | | 51 | 25 Oct 2018 |
| Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity | Chulhee Yun, S. Sra, Ali Jadbabaie | | 117 | 17 Oct 2018 |