ResearchTrend.AI

Overparameterized Nonlinear Learning: Gradient Descent Takes the Shortest Path?
Samet Oymak, Mahdi Soltanolkotabi
25 December 2018. arXiv:1812.10004. [ODL]

Papers citing "Overparameterized Nonlinear Learning: Gradient Descent Takes the Shortest Path?" (50 of 60 shown)
• Coreset-Based Task Selection for Sample-Efficient Meta-Reinforcement Learning. Donglin Zhan, Leonardo F. Toso, James Anderson. 04 Feb 2025.
• Theoretical Insights into Overparameterized Models in Multi-Task and Replay-Based Continual Learning [CLL, LRM]. Mohammadamin Banayeeanzade, Mahdi Soltanolkotabi, Mohammad Rostami. 29 Aug 2024.
• Reparameterization invariance in approximate Bayesian inference [BDL]. Hrittik Roy, M. Miani, Carl Henrik Ek, Philipp Hennig, Marvin Pförtner, Lukas Tatzel, Søren Hauberg. 05 Jun 2024.
• Faster Convergence of Stochastic Accelerated Gradient Descent under Interpolation. Aaron Mishkin, Mert Pilanci, Mark Schmidt. 03 Apr 2024.
• Gradient Descent with Early Stopping is Provably Robust to Label Noise for Overparameterized Neural Networks [NoLa]. Mingchen Li, Mahdi Soltanolkotabi, Samet Oymak. 27 Mar 2019.
• Fitting ReLUs via SGD and Quantized SGD. Seyed Mohammadreza Mousavi Kalan, Mahdi Soltanolkotabi, A. Avestimehr. 19 Jan 2019.
• Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks [ODL]. Difan Zou, Yuan Cao, Dongruo Zhou, Quanquan Gu. 21 Nov 2018.
• Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers [MLT]. Zeyuan Allen-Zhu, Yuanzhi Li, Yingyu Liang. 12 Nov 2018.
• A Convergence Theory for Deep Learning via Over-Parameterization [AI4CE, ODL]. Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song. 09 Nov 2018.
• Gradient Descent Finds Global Minima of Deep Neural Networks [ODL]. S. Du, Jason D. Lee, Haochuan Li, Liwei Wang, Masayoshi Tomizuka. 09 Nov 2018.
• On exponential convergence of SGD in non-convex over-parametrized learning. Xinhai Liu, M. Belkin, Yu-Shen Liu. 06 Nov 2018.
• Fast and Faster Convergence of SGD for Over-Parameterized Models and an Accelerated Perceptron. Sharan Vaswani, Francis R. Bach, Mark Schmidt. 16 Oct 2018.
• Why do Larger Models Generalize Better? A Theoretical Perspective via the XOR Problem [MLT]. Alon Brutzkus, Amir Globerson. 06 Oct 2018.
• Gradient Descent Provably Optimizes Over-parameterized Neural Networks [MLT, ODL]. S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh. 04 Oct 2018.
• Gradient descent aligns the layers of deep linear networks. Ziwei Ji, Matus Telgarsky. 04 Oct 2018.
• Stochastic Gradient Descent Learns State Equations with Nonlinear Activations. Samet Oymak. 09 Sep 2018.
• Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data [MLT]. Yuanzhi Li, Yingyu Liang. 03 Aug 2018.
• Just Interpolate: Kernel "Ridgeless" Regression Can Generalize. Tengyuan Liang, Alexander Rakhlin. 01 Aug 2018.
• Does data interpolation contradict statistical optimality? M. Belkin, Alexander Rakhlin, Alexandre B. Tsybakov. 25 Jun 2018.
• Overfitting or perfect fitting? Risk bounds for classification and regression rules that interpolate [AI4CE]. M. Belkin, Daniel J. Hsu, P. Mitra. 13 Jun 2018.
• Stochastic Gradient Descent on Separable Data: Exact Convergence with a Fixed Learning Rate [FedML, MLT]. Mor Shpigel Nacson, Nathan Srebro, Daniel Soudry. 05 Jun 2018.
• Stochastic Gradient/Mirror Descent: Minimax Optimality and Implicit Regularization. Navid Azizan, B. Hassibi. 04 Jun 2018.
• On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport [OT]. Lénaïc Chizat, Francis R. Bach. 24 May 2018.
• The Global Optimization Geometry of Shallow Linear Neural Networks [ODL]. Zhihui Zhu, Daniel Soudry, Yonina C. Eldar, M. Wakin. 13 May 2018.
• A Mean Field View of the Landscape of Two-Layers Neural Networks [MLT]. Song Mei, Andrea Montanari, Phan-Minh Nguyen. 18 Apr 2018.
• Gradient Descent with Random Initialization: Fast Global Convergence for Nonconvex Phase Retrieval. Yuxin Chen, Yuejie Chi, Jianqing Fan, Cong Ma. 21 Mar 2018.
• On the Optimization of Deep Networks: Implicit Acceleration by Overparameterization. Sanjeev Arora, Nadav Cohen, Elad Hazan. 19 Feb 2018.
• Spurious Valleys in Two-layer Neural Network Optimization Landscapes. Luca Venturi, Afonso S. Bandeira, Joan Bruna. 18 Feb 2018.
• Stronger generalization bounds for deep nets via a compression approach [MLT, AI4CE]. Sanjeev Arora, Rong Ge, Behnam Neyshabur, Yi Zhang. 14 Feb 2018.
• Learning Compact Neural Networks with Regularization [MLT]. Samet Oymak. 05 Feb 2018.
• The Power of Interpolation: Understanding the Effectiveness of SGD in Modern Over-parametrized Learning. Siyuan Ma, Raef Bassily, M. Belkin. 18 Dec 2017.
• Size-Independent Sample Complexity of Neural Networks. Noah Golowich, Alexander Rakhlin, Ohad Shamir. 18 Dec 2017.
• Learning One-hidden-layer Neural Networks with Landscape Design [MLT]. Rong Ge, Jason D. Lee, Tengyu Ma. 01 Nov 2017.
• The Implicit Bias of Gradient Descent on Separable Data. Daniel Soudry, Elad Hoffer, Mor Shpigel Nacson, Suriya Gunasekar, Nathan Srebro. 27 Oct 2017.
• SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data [MLT]. Alon Brutzkus, Amir Globerson, Eran Malach, Shai Shalev-Shwartz. 27 Oct 2017.
• Gradient Methods for Submodular Maximization. Hamed Hassani, Mahdi Soltanolkotabi, Amin Karbasi. 13 Aug 2017.
• Theoretical insights into the optimization landscape of over-parameterized shallow neural networks. Mahdi Soltanolkotabi, Adel Javanmard, Jason D. Lee. 16 Jul 2017.
• Phase Retrieval via Randomized Kaczmarz: Theoretical Guarantees. Yan Shuo Tan, Roman Vershynin. 30 Jun 2017.
• Spectrally-normalized margin bounds for neural networks [ODL]. Peter L. Bartlett, Dylan J. Foster, Matus Telgarsky. 26 Jun 2017.
• Empirical Analysis of the Hessian of Over-Parametrized Neural Networks. Levent Sagun, Utku Evci, V. U. Güney, Yann N. Dauphin, Léon Bottou. 14 Jun 2017.
• Recovery Guarantees for One-hidden-layer Neural Networks [MLT]. Kai Zhong, Zhao Song, Prateek Jain, Peter L. Bartlett, Inderjit S. Dhillon. 10 Jun 2017.
• Implicit Regularization in Matrix Factorization. Suriya Gunasekar, Blake E. Woodworth, Srinadh Bhojanapalli, Behnam Neyshabur, Nathan Srebro. 25 May 2017.
• The Marginal Value of Adaptive Gradient Methods in Machine Learning [ODL]. Ashia Wilson, Rebecca Roelofs, Mitchell Stern, Nathan Srebro, Benjamin Recht. 23 May 2017.
• Learning ReLUs via Gradient Descent [MLT]. Mahdi Soltanolkotabi. 10 May 2017.
• Geometry of Optimization and Implicit Regularization in Deep Learning [AI4CE]. Behnam Neyshabur, Ryota Tomioka, Ruslan Salakhutdinov, Nathan Srebro. 08 May 2017.
• Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs [MLT]. Alon Brutzkus, Amir Globerson. 26 Feb 2017.
• Structured signal recovery from quadratic measurements: Breaking sample complexity barriers via nonconvex optimization. Mahdi Soltanolkotabi. 20 Feb 2017.
• Understanding deep learning requires rethinking generalization [HAI]. Chiyuan Zhang, Samy Bengio, Moritz Hardt, Benjamin Recht, Oriol Vinyals. 10 Nov 2016.
• Entropy-SGD: Biasing Gradient Descent Into Wide Valleys [ODL]. Pratik Chaudhari, A. Choromańska, Stefano Soatto, Yann LeCun, Carlo Baldassi, C. Borgs, J. Chayes, Levent Sagun, R. Zecchina. 06 Nov 2016.
• On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima [ODL]. N. Keskar, Dheevatsa Mudigere, J. Nocedal, M. Smelyanskiy, P. T. P. Tang. 15 Sep 2016.