ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Gradient Descent Provably Optimizes Over-parameterized Neural Networks
4 October 2018
S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh
Topics: MLT, ODL

Papers citing "Gradient Descent Provably Optimizes Over-parameterized Neural Networks"

50 / 882 papers shown
  • No Data Augmentation? Alternative Regularizations for Effective Training on Small Datasets. Lorenzo Brigato, Stavroula Mougiakakou. 04 Sep 2023.
  • On the training and generalization of deep operator networks. Sanghyun Lee, Yeonjong Shin. 02 Sep 2023.
  • Robust Point Cloud Processing through Positional Embedding. Jianqiao Zheng, Xueqian Li, Sameera Ramasinghe, Simon Lucey. [3DPC] 01 Sep 2023.
  • Transformers as Support Vector Machines. Davoud Ataee Tarzanagh, Yingcong Li, Christos Thrampoulidis, Samet Oymak. 31 Aug 2023.
  • Six Lectures on Linearized Neural Networks. Theodor Misiakiewicz, Andrea Montanari. 25 Aug 2023.
  • How to Protect Copyright Data in Optimization of Large Language Models? T. Chu, Zhao Song, Chiwun Yang. 23 Aug 2023.
  • Towards Understanding the Generalizability of Delayed Stochastic Gradient Descent. Xiaoge Deng, Li Shen, Shengwei Li, Tao Sun, Dongsheng Li, Dacheng Tao. 18 Aug 2023.
  • Convergence of Two-Layer Regression with Nonlinear Units. Yichuan Deng, Zhao Song, Shenghao Xie. 16 Aug 2023.
  • Memory capacity of two layer neural networks with smooth activations. Liam Madden, Christos Thrampoulidis. [MLT] 03 Aug 2023.
  • Understanding Deep Neural Networks via Linear Separability of Hidden Layers. Chao Zhang, Xinyuan Chen, Wensheng Li, Lixue Liu, Wei Wu, Dacheng Tao. 26 Jul 2023.
  • What can a Single Attention Layer Learn? A Study Through the Random Features Lens. Hengyu Fu, Tianyu Guo, Yu Bai, Song Mei. [MLT] 21 Jul 2023.
  • FedBug: A Bottom-Up Gradual Unfreezing Framework for Federated Learning. Chia-Hsiang Kao, Yu-Chiang Frank Wang. [FedML] 19 Jul 2023.
  • Discovering a reaction-diffusion model for Alzheimer's disease by combining PINNs with symbolic regression. Zhen Zhang, Zongren Zou, E. Kuhl, George Karniadakis. 16 Jul 2023.
  • Efficient SGD Neural Network Training via Sublinear Activated Neuron Identification. Lianke Qin, Zhao Song, Yuanyuan Yang. 13 Jul 2023.
  • Quantitative CLTs in Deep Neural Networks. Stefano Favaro, Boris Hanin, Domenico Marinucci, I. Nourdin, G. Peccati. [BDL] 12 Jul 2023.
  • Fundamental limits of overparametrized shallow neural networks for supervised learning. Francesco Camilli, D. Tieplova, Jean Barbier. 11 Jul 2023.
  • Neural Hilbert Ladders: Multi-Layer Neural Networks in Function Space. Zhengdao Chen. 03 Jul 2023.
  • A Unified Approach to Controlling Implicit Regularization via Mirror Descent. Haoyuan Sun, Khashayar Gatmiry, Kwangjun Ahn, Navid Azizan. [AI4CE] 24 Jun 2023.
  • Scaling MLPs: A Tale of Inductive Bias. Gregor Bachmann, Sotiris Anagnostidis, Thomas Hofmann. 23 Jun 2023.
  • The RL Perceptron: Generalisation Dynamics of Policy Learning in High Dimensions. Nishil Patel, Sebastian Lee, Stefano Sarao Mannelli, Sebastian Goldt, Andrew Saxe. [OffRL] 17 Jun 2023.
  • Batches Stabilize the Minimum Norm Risk in High Dimensional Overparameterized Linear Regression. Shahar Stein Ioushua, Inbar Hasidim, O. Shayevitz, M. Feder. 14 Jun 2023.
  • On Achieving Optimal Adversarial Test Error. Justin D. Li, Matus Telgarsky. [AAML] 13 Jun 2023.
  • Learning Unnormalized Statistical Models via Compositional Optimization. Wei Jiang, Jiayu Qin, Lingyu Wu, Changyou Chen, Tianbao Yang, Lijun Zhang. 13 Jun 2023.
  • A Theory of Unsupervised Speech Recognition. Liming Wang, M. Hasegawa-Johnson, Chang D. Yoo. [SSL] 09 Jun 2023.
  • Efficient Uncertainty Quantification and Reduction for Over-Parameterized Neural Networks. Ziyi Huang, Henry Lam, Haofeng Zhang. [UQCV] 09 Jun 2023.
  • Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning. Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, M. Belkin. 07 Jun 2023.
  • Query Complexity of Active Learning for Function Family With Nearly Orthogonal Basis. Xiangyi Chen, Zhao Song, Baochen Sun, Junze Yin, Danyang Zhuo. 06 Jun 2023.
  • Aiming towards the minimizers: fast convergence of SGD for overparametrized problems. Chaoyue Liu, Dmitriy Drusvyatskiy, M. Belkin, Damek Davis, Yi-An Ma. [ODL] 05 Jun 2023.
  • Initial Guessing Bias: How Untrained Networks Favor Some Classes. Emanuele Francazi, Aurelien Lucchi, Marco Baity-Jesi. [AI4CE] 01 Jun 2023.
  • Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression. Runtian Zhai, Bing Liu, Andrej Risteski, Zico Kolter, Pradeep Ravikumar. [SSL] 01 Jun 2023.
  • Combinatorial Neural Bandits. Taehyun Hwang, Kyuwook Chai, Min Hwan Oh. 31 May 2023.
  • Deep Clustering with Incomplete Noisy Pairwise Annotations: A Geometric Regularization Approach. Tri Nguyen, Shahana Ibrahim, Xiao Fu. 30 May 2023.
  • Benign Overfitting in Deep Neural Networks under Lazy Training. Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, Francesco Locatello, Volkan Cevher. [AI4CE] 30 May 2023.
  • Matrix Information Theory for Self-Supervised Learning. Yifan Zhang, Zhi-Hao Tan, Jingqin Yang, Weiran Huang, Yang Yuan. [SSL] 27 May 2023.
  • Generalization Guarantees of Gradient Descent for Multi-Layer Neural Networks. Puyu Wang, Yunwen Lei, Di Wang, Yiming Ying, Ding-Xuan Zhou. [MLT] 26 May 2023.
  • Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer. Yuandong Tian, Yiping Wang, Beidi Chen, S. Du. [MLT] 25 May 2023.
  • On progressive sharpening, flat minima and generalisation. L. MacDonald, Jack Valmadre, Simon Lucey. 24 May 2023.
  • Tight conditions for when the NTK approximation is valid. Enric Boix-Adserà, Etai Littwin. 22 May 2023.
  • A Scalable Walsh-Hadamard Regularizer to Overcome the Low-degree Spectral Bias of Neural Networks. Ali Gorji, Andisheh Amrollahi, A. Krause. 16 May 2023.
  • Deep ReLU Networks Have Surprisingly Simple Polytopes. Fenglei Fan, Wei Huang, Xiang-yu Zhong, Lecheng Ruan, T. Zeng, Huan Xiong, Fei Wang. 16 May 2023.
  • ReLU soothes the NTK condition number and accelerates optimization for wide neural networks. Chaoyue Liu, Like Hui. [MLT] 15 May 2023.
  • Efficient Asynchronize Stochastic Gradient Algorithm with Structured Data. Zhao Song, Mingquan Ye. 13 May 2023.
  • Depth Dependence of μP Learning Rates in ReLU MLPs. Samy Jelassi, Boris Hanin, Ziwei Ji, Sashank J. Reddi, Srinadh Bhojanapalli, Surinder Kumar. 13 May 2023.
  • Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks. Eshaan Nichani, Alexandru Damian, Jason D. Lee. [MLT] 11 May 2023.
  • Random Smoothing Regularization in Kernel Gradient Descent Learning. Liang Ding, Tianyang Hu, Jiahan Jiang, Donghao Li, Wei Cao, Yuan Yao. 05 May 2023.
  • On the Eigenvalue Decay Rates of a Class of Neural-Network Related Kernel Functions Defined on General Domains. Yicheng Li, Zixiong Yu, Y. Cotronis, Qian Lin. 04 May 2023.
  • Expand-and-Cluster: Parameter Recovery of Neural Networks. Flavio Martinelli, Berfin Simsek, W. Gerstner, Johanni Brea. 25 Apr 2023.
  • DiffFit: Unlocking Transferability of Large Diffusion Models via Simple Parameter-Efficient Fine-Tuning. Enze Xie, Lewei Yao, Han Shi, Zhili Liu, Daquan Zhou, Zhaoqiang Liu, Jiawei Li, Zhenguo Li. 13 Apr 2023.
  • Understanding Overfitting in Adversarial Training via Kernel Regression. Teng Zhang, Kang Li. 13 Apr 2023.
  • Physics-informed radial basis network (PIRBN): A local approximating neural network for solving nonlinear PDEs. Jinshuai Bai, Guirong Liu, Ashish Gupta, Laith Alzubaidi, Xinzhu Feng, Yuantong T. Gu. [PINN] 13 Apr 2023.