arXiv:1810.02054
Gradient Descent Provably Optimizes Over-parameterized Neural Networks
4 October 2018 · S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh · MLT, ODL
Papers citing "Gradient Descent Provably Optimizes Over-parameterized Neural Networks" (50 of 882 shown)
- No Data Augmentation? Alternative Regularizations for Effective Training on Small Datasets · Lorenzo Brigato, Stavroula Mougiakakou · 04 Sep 2023
- On the training and generalization of deep operator networks · Sanghyun Lee, Yeonjong Shin · 02 Sep 2023
- Robust Point Cloud Processing through Positional Embedding · Jianqiao Zheng, Xueqian Li, Sameera Ramasinghe, Simon Lucey · 3DPC · 01 Sep 2023
- Transformers as Support Vector Machines · Davoud Ataee Tarzanagh, Yingcong Li, Christos Thrampoulidis, Samet Oymak · 31 Aug 2023
- Six Lectures on Linearized Neural Networks · Theodor Misiakiewicz, Andrea Montanari · 25 Aug 2023
- How to Protect Copyright Data in Optimization of Large Language Models? · T. Chu, Zhao Song, Chiwun Yang · 23 Aug 2023
- Towards Understanding the Generalizability of Delayed Stochastic Gradient Descent · Xiaoge Deng, Li Shen, Shengwei Li, Tao Sun, Dongsheng Li, Dacheng Tao · 18 Aug 2023
- Convergence of Two-Layer Regression with Nonlinear Units · Yichuan Deng, Zhao Song, Shenghao Xie · 16 Aug 2023
- Memory capacity of two layer neural networks with smooth activations · Liam Madden, Christos Thrampoulidis · MLT · 03 Aug 2023
- Understanding Deep Neural Networks via Linear Separability of Hidden Layers · Chao Zhang, Xinyuan Chen, Wensheng Li, Lixue Liu, Wei Wu, Dacheng Tao · 26 Jul 2023
- What can a Single Attention Layer Learn? A Study Through the Random Features Lens · Hengyu Fu, Tianyu Guo, Yu Bai, Song Mei · MLT · 21 Jul 2023
- FedBug: A Bottom-Up Gradual Unfreezing Framework for Federated Learning · Chia-Hsiang Kao, Yu-Chiang Frank Wang · FedML · 19 Jul 2023
- Discovering a reaction-diffusion model for Alzheimer's disease by combining PINNs with symbolic regression · Zhen Zhang, Zongren Zou, E. Kuhl, George Karniadakis · 16 Jul 2023
- Efficient SGD Neural Network Training via Sublinear Activated Neuron Identification · Lianke Qin, Zhao Song, Yuanyuan Yang · 13 Jul 2023
- Quantitative CLTs in Deep Neural Networks · Stefano Favaro, Boris Hanin, Domenico Marinucci, I. Nourdin, G. Peccati · BDL · 12 Jul 2023
- Fundamental limits of overparametrized shallow neural networks for supervised learning · Francesco Camilli, D. Tieplova, Jean Barbier · 11 Jul 2023
- Neural Hilbert Ladders: Multi-Layer Neural Networks in Function Space · Zhengdao Chen · 03 Jul 2023
- A Unified Approach to Controlling Implicit Regularization via Mirror Descent · Haoyuan Sun, Khashayar Gatmiry, Kwangjun Ahn, Navid Azizan · AI4CE · 24 Jun 2023
- Scaling MLPs: A Tale of Inductive Bias · Gregor Bachmann, Sotiris Anagnostidis, Thomas Hofmann · 23 Jun 2023
- The RL Perceptron: Generalisation Dynamics of Policy Learning in High Dimensions · Nishil Patel, Sebastian Lee, Stefano Sarao Mannelli, Sebastian Goldt, Andrew Saxe · OffRL · 17 Jun 2023
- Batches Stabilize the Minimum Norm Risk in High Dimensional Overparameterized Linear Regression · Shahar Stein Ioushua, Inbar Hasidim, O. Shayevitz, M. Feder · 14 Jun 2023
- On Achieving Optimal Adversarial Test Error · Justin D. Li, Matus Telgarsky · AAML · 13 Jun 2023
- Learning Unnormalized Statistical Models via Compositional Optimization · Wei Jiang, Jiayu Qin, Lingyu Wu, Changyou Chen, Tianbao Yang, Lijun Zhang · 13 Jun 2023
- A Theory of Unsupervised Speech Recognition · Liming Wang, M. Hasegawa-Johnson, Chang D. Yoo · SSL · 09 Jun 2023
- Efficient Uncertainty Quantification and Reduction for Over-Parameterized Neural Networks · Ziyi Huang, Henry Lam, Haofeng Zhang · UQCV · 09 Jun 2023
- Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning · Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, M. Belkin · 07 Jun 2023
- Query Complexity of Active Learning for Function Family With Nearly Orthogonal Basis · Xiangyi Chen, Zhao Song, Baochen Sun, Junze Yin, Danyang Zhuo · 06 Jun 2023
- Aiming towards the minimizers: fast convergence of SGD for overparametrized problems · Chaoyue Liu, Dmitriy Drusvyatskiy, M. Belkin, Damek Davis, Yi-An Ma · ODL · 05 Jun 2023
- Initial Guessing Bias: How Untrained Networks Favor Some Classes · Emanuele Francazi, Aurelien Lucchi, Marco Baity-Jesi · AI4CE · 01 Jun 2023
- Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression · Runtian Zhai, Bing Liu, Andrej Risteski, Zico Kolter, Pradeep Ravikumar · SSL · 01 Jun 2023
- Combinatorial Neural Bandits · Taehyun Hwang, Kyuwook Chai, Min Hwan Oh · 31 May 2023
- Deep Clustering with Incomplete Noisy Pairwise Annotations: A Geometric Regularization Approach · Tri Nguyen, Shahana Ibrahim, Xiao Fu · 30 May 2023
- Benign Overfitting in Deep Neural Networks under Lazy Training · Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, Francesco Locatello, Volkan Cevher · AI4CE · 30 May 2023
- Matrix Information Theory for Self-Supervised Learning · Yifan Zhang, Zhi-Hao Tan, Jingqin Yang, Weiran Huang, Yang Yuan · SSL · 27 May 2023
- Generalization Guarantees of Gradient Descent for Multi-Layer Neural Networks · Puyu Wang, Yunwen Lei, Di Wang, Yiming Ying, Ding-Xuan Zhou · MLT · 26 May 2023
- Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer · Yuandong Tian, Yiping Wang, Beidi Chen, S. Du · MLT · 25 May 2023
- On progressive sharpening, flat minima and generalisation · L. MacDonald, Jack Valmadre, Simon Lucey · 24 May 2023
- Tight conditions for when the NTK approximation is valid · Enric Boix-Adserà, Etai Littwin · 22 May 2023
- A Scalable Walsh-Hadamard Regularizer to Overcome the Low-degree Spectral Bias of Neural Networks · Ali Gorji, Andisheh Amrollahi, A. Krause · 16 May 2023
- Deep ReLU Networks Have Surprisingly Simple Polytopes · Fenglei Fan, Wei Huang, Xiang-yu Zhong, Lecheng Ruan, T. Zeng, Huan Xiong, Fei Wang · 16 May 2023
- ReLU soothes the NTK condition number and accelerates optimization for wide neural networks · Chaoyue Liu, Like Hui · MLT · 15 May 2023
- Efficient Asynchronize Stochastic Gradient Algorithm with Structured Data · Zhao Song, Mingquan Ye · 13 May 2023
- Depth Dependence of μP Learning Rates in ReLU MLPs · Samy Jelassi, Boris Hanin, Ziwei Ji, Sashank J. Reddi, Srinadh Bhojanapalli, Surinder Kumar · 13 May 2023
- Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks · Eshaan Nichani, Alexandru Damian, Jason D. Lee · MLT · 11 May 2023
- Random Smoothing Regularization in Kernel Gradient Descent Learning · Liang Ding, Tianyang Hu, Jiahan Jiang, Donghao Li, Wei Cao, Yuan Yao · 05 May 2023
- On the Eigenvalue Decay Rates of a Class of Neural-Network Related Kernel Functions Defined on General Domains · Yicheng Li, Zixiong Yu, Y. Cotronis, Qian Lin · 04 May 2023
- Expand-and-Cluster: Parameter Recovery of Neural Networks · Flavio Martinelli, Berfin Simsek, W. Gerstner, Johanni Brea · 25 Apr 2023
- DiffFit: Unlocking Transferability of Large Diffusion Models via Simple Parameter-Efficient Fine-Tuning · Enze Xie, Lewei Yao, Han Shi, Zhili Liu, Daquan Zhou, Zhaoqiang Liu, Jiawei Li, Zhenguo Li · 13 Apr 2023
- Understanding Overfitting in Adversarial Training via Kernel Regression · Teng Zhang, Kang Li · 13 Apr 2023
- Physics-informed radial basis network (PIRBN): A local approximating neural network for solving nonlinear PDEs · Jinshuai Bai, Guirong Liu, Ashish Gupta, Laith Alzubaidi, Xinzhu Feng, Yuantong T. Gu · PINN · 13 Apr 2023