Overparameterized Nonlinear Learning: Gradient Descent Takes the Shortest Path?
Samet Oymak, Mahdi Soltanolkotabi
arXiv:1812.10004, 25 December 2018. [ODL]
Papers citing "Overparameterized Nonlinear Learning: Gradient Descent Takes the Shortest Path?" (showing 50 of 60)
Coreset-Based Task Selection for Sample-Efficient Meta-Reinforcement Learning (04 Feb 2025). Donglin Zhan, Leonardo F. Toso, James Anderson.
Theoretical Insights into Overparameterized Models in Multi-Task and Replay-Based Continual Learning (29 Aug 2024). Mohammadamin Banayeeanzade, Mahdi Soltanolkotabi, Mohammad Rostami. [CLL, LRM]
Reparameterization invariance in approximate Bayesian inference (05 Jun 2024). Hrittik Roy, M. Miani, Carl Henrik Ek, Philipp Hennig, Marvin Pförtner, Lukas Tatzel, Søren Hauberg. [BDL]
Faster Convergence of Stochastic Accelerated Gradient Descent under Interpolation (03 Apr 2024). Aaron Mishkin, Mert Pilanci, Mark Schmidt.
Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks (27 Mar 2019). Mingchen Li, Mahdi Soltanolkotabi, Samet Oymak. [NoLa]
Fitting ReLUs via SGD and Quantized SGD (19 Jan 2019). Seyed Mohammadreza Mousavi Kalan, Mahdi Soltanolkotabi, A. Avestimehr.
Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks (21 Nov 2018). Difan Zou, Yuan Cao, Dongruo Zhou, Quanquan Gu. [ODL]
Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers (12 Nov 2018). Zeyuan Allen-Zhu, Yuanzhi Li, Yingyu Liang. [MLT]
A Convergence Theory for Deep Learning via Over-Parameterization (09 Nov 2018). Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song. [AI4CE, ODL]
Gradient Descent Finds Global Minima of Deep Neural Networks (09 Nov 2018). S. Du, Jason D. Lee, Haochuan Li, Liwei Wang, Masayoshi Tomizuka. [ODL]
On exponential convergence of SGD in non-convex over-parametrized learning (06 Nov 2018). Xinhai Liu, M. Belkin, Yu-Shen Liu.
Fast and Faster Convergence of SGD for Over-Parameterized Models and an Accelerated Perceptron (16 Oct 2018). Sharan Vaswani, Francis R. Bach, Mark Schmidt.
Why do Larger Models Generalize Better? A Theoretical Perspective via the XOR Problem (06 Oct 2018). Alon Brutzkus, Amir Globerson. [MLT]
Gradient Descent Provably Optimizes Over-parameterized Neural Networks (04 Oct 2018). S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh. [MLT, ODL]
Gradient descent aligns the layers of deep linear networks (04 Oct 2018). Ziwei Ji, Matus Telgarsky.
Stochastic Gradient Descent Learns State Equations with Nonlinear Activations (09 Sep 2018). Samet Oymak.
Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data (03 Aug 2018). Yuanzhi Li, Yingyu Liang. [MLT]
Just Interpolate: Kernel "Ridgeless" Regression Can Generalize (01 Aug 2018). Tengyuan Liang, Alexander Rakhlin.
Does data interpolation contradict statistical optimality? (25 Jun 2018). M. Belkin, Alexander Rakhlin, Alexandre B. Tsybakov.
Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate (13 Jun 2018). M. Belkin, Daniel J. Hsu, P. Mitra. [AI4CE]
Stochastic Gradient Descent on Separable Data: Exact Convergence with a Fixed Learning Rate (05 Jun 2018). Mor Shpigel Nacson, Nathan Srebro, Daniel Soudry. [FedML, MLT]
Stochastic Gradient/Mirror Descent: Minimax Optimality and Implicit Regularization (04 Jun 2018). Navid Azizan, B. Hassibi.
On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport (24 May 2018). Lénaïc Chizat, Francis R. Bach. [OT]
The Global Optimization Geometry of Shallow Linear Neural Networks (13 May 2018). Zhihui Zhu, Daniel Soudry, Yonina C. Eldar, M. Wakin. [ODL]
A Mean Field View of the Landscape of Two-Layers Neural Networks (18 Apr 2018). Song Mei, Andrea Montanari, Phan-Minh Nguyen. [MLT]
Gradient Descent with Random Initialization: Fast Global Convergence for Nonconvex Phase Retrieval (21 Mar 2018). Yuxin Chen, Yuejie Chi, Jianqing Fan, Cong Ma.
On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization (19 Feb 2018). Sanjeev Arora, Nadav Cohen, Elad Hazan.
Spurious Valleys in Two-layer Neural Network Optimization Landscapes (18 Feb 2018). Luca Venturi, Afonso S. Bandeira, Joan Bruna.
Stronger generalization bounds for deep nets via a compression approach (14 Feb 2018). Sanjeev Arora, Rong Ge, Behnam Neyshabur, Yi Zhang. [MLT, AI4CE]
Learning Compact Neural Networks with Regularization (05 Feb 2018). Samet Oymak. [MLT]
The Power of Interpolation: Understanding the Effectiveness of SGD in Modern Over-parametrized Learning (18 Dec 2017). Siyuan Ma, Raef Bassily, M. Belkin.
Size-Independent Sample Complexity of Neural Networks (18 Dec 2017). Noah Golowich, Alexander Rakhlin, Ohad Shamir.
Learning One-hidden-layer Neural Networks with Landscape Design (01 Nov 2017). Rong Ge, Jason D. Lee, Tengyu Ma. [MLT]
The Implicit Bias of Gradient Descent on Separable Data (27 Oct 2017). Daniel Soudry, Elad Hoffer, Mor Shpigel Nacson, Suriya Gunasekar, Nathan Srebro.
SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data (27 Oct 2017). Alon Brutzkus, Amir Globerson, Eran Malach, Shai Shalev-Shwartz. [MLT]
Gradient Methods for Submodular Maximization (13 Aug 2017). Hamed Hassani, Mahdi Soltanolkotabi, Amin Karbasi.
Theoretical insights into the optimization landscape of over-parameterized shallow neural networks (16 Jul 2017). Mahdi Soltanolkotabi, Adel Javanmard, Jason D. Lee.
Phase Retrieval via Randomized Kaczmarz: Theoretical Guarantees (30 Jun 2017). Yan Shuo Tan, Roman Vershynin.
Spectrally-normalized margin bounds for neural networks (26 Jun 2017). Peter L. Bartlett, Dylan J. Foster, Matus Telgarsky. [ODL]
Empirical Analysis of the Hessian of Over-Parametrized Neural Networks (14 Jun 2017). Levent Sagun, Utku Evci, V. U. Güney, Yann N. Dauphin, Léon Bottou.
Recovery Guarantees for One-hidden-layer Neural Networks (10 Jun 2017). Kai Zhong, Zhao Song, Prateek Jain, Peter L. Bartlett, Inderjit S. Dhillon. [MLT]
Implicit Regularization in Matrix Factorization (25 May 2017). Suriya Gunasekar, Blake E. Woodworth, Srinadh Bhojanapalli, Behnam Neyshabur, Nathan Srebro.
The Marginal Value of Adaptive Gradient Methods in Machine Learning (23 May 2017). Ashia Wilson, Rebecca Roelofs, Mitchell Stern, Nathan Srebro, Benjamin Recht. [ODL]
Learning ReLUs via Gradient Descent (10 May 2017). Mahdi Soltanolkotabi. [MLT]
Geometry of Optimization and Implicit Regularization in Deep Learning (08 May 2017). Behnam Neyshabur, Ryota Tomioka, Ruslan Salakhutdinov, Nathan Srebro. [AI4CE]
Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs (26 Feb 2017). Alon Brutzkus, Amir Globerson. [MLT]
Structured signal recovery from quadratic measurements: Breaking sample complexity barriers via nonconvex optimization (20 Feb 2017). Mahdi Soltanolkotabi.
Understanding deep learning requires rethinking generalization (10 Nov 2016). Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals. [HAI]
Entropy-SGD: Biasing Gradient Descent Into Wide Valleys (06 Nov 2016). Pratik Chaudhari, A. Choromańska, Stefano Soatto, Yann LeCun, Carlo Baldassi, C. Borgs, J. Chayes, Levent Sagun, R. Zecchina. [ODL]
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima (15 Sep 2016). N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang. [ODL]