arXiv:1811.03804
Cited By
Gradient Descent Finds Global Minima of Deep Neural Networks
S. Du, J. Lee, Haochuan Li, Liwei Wang, M. Tomizuka (9 November 2018) [ODL]
Papers citing "Gradient Descent Finds Global Minima of Deep Neural Networks" (50 of 763 papers shown)
How many Neurons do we need? A refined Analysis for Shallow Networks trained with Gradient Descent
Mike Nguyen, Nicole Mücke (14 Sep 2023) [MLT]

Optimization Guarantees of Unfolded ISTA and ADMM Networks With Smooth Soft-Thresholding
Shaik Basheeruddin Shah, Pradyumna Pradhan, Wei Pu, Ramunaidu Randhi, Miguel R. D. Rodrigues, Yonina C. Eldar (12 Sep 2023)

Generalization error bounds for iterative learning algorithms with bounded updates
Jingwen Fu, Nanning Zheng (10 Sep 2023)

Approximation Results for Gradient Descent trained Neural Networks
G. Welper (09 Sep 2023)

Optimal Rate of Kernel Regression in Large Dimensions
Weihao Lu, Hao Zhang, Yicheng Li, Manyun Xu, Qian Lin (08 Sep 2023)

No Data Augmentation? Alternative Regularizations for Effective Training on Small Datasets
Lorenzo Brigato, S. Mougiakakou (04 Sep 2023)

On the training and generalization of deep operator networks
Sanghyun Lee, Yeonjong Shin (02 Sep 2023)

Multilayer Multiset Neuronal Networks -- MMNNs
Alexandre Benatti, L. D. F. Costa (28 Aug 2023)

Six Lectures on Linearized Neural Networks
Theodor Misiakiewicz, Andrea Montanari (25 Aug 2023)

Expressive probabilistic sampling in recurrent neural networks
Shirui Chen, Linxing Jiang, Rajesh P. N. Rao, E. Shea-Brown (22 Aug 2023) [DiffM]

Equitable Time-Varying Pricing Tariff Design: A Joint Learning and Optimization Approach
Liudong Chen, Bolun Xu (26 Jul 2023)
Understanding Deep Neural Networks via Linear Separability of Hidden Layers
Chao Zhang, Xinyuan Chen, Wensheng Li, Lixue Liu, Wei Wu, Dacheng Tao (26 Jul 2023)

What can a Single Attention Layer Learn? A Study Through the Random Features Lens
Hengyu Fu, Tianyu Guo, Yu Bai, Song Mei (21 Jul 2023) [MLT]

Provable Multi-Task Representation Learning by Two-Layer ReLU Neural Networks
Liam Collins, Hamed Hassani, Mahdi Soltanolkotabi, Aryan Mokhtari, Sanjay Shakkottai (13 Jul 2023)

Efficient SGD Neural Network Training via Sublinear Activated Neuron Identification
Lianke Qin, Zhao-quan Song, Yuanyuan Yang (13 Jul 2023)

Fundamental limits of overparametrized shallow neural networks for supervised learning
Francesco Camilli, D. Tieplova, Jean Barbier (11 Jul 2023)

Neural Hilbert Ladders: Multi-Layer Neural Networks in Function Space
Zhengdao Chen (03 Jul 2023)

Graph Neural Networks Provably Benefit from Structural Information: A Feature Learning Perspective
Wei Huang, Yuanbin Cao, Hong Wang, Xin Cao, Taiji Suzuki (24 Jun 2023) [MLT]

Max-Margin Token Selection in Attention Mechanism
Davoud Ataee Tarzanagh, Yingcong Li, Xuechen Zhang, Samet Oymak (23 Jun 2023)

Gradient is All You Need?
Konstantin Riedl, T. Klock, Carina Geldhauser, M. Fornasier (16 Jun 2023)

Batches Stabilize the Minimum Norm Risk in High Dimensional Overparameterized Linear Regression
Shahar Stein Ioushua, Inbar Hasidim, O. Shayevitz, M. Feder (14 Jun 2023)
Nonparametric regression using over-parameterized shallow ReLU neural networks
Yunfei Yang, Ding-Xuan Zhou (14 Jun 2023)

Efficient Uncertainty Quantification and Reduction for Over-Parameterized Neural Networks
Ziyi Huang, H. Lam, Haofeng Zhang (09 Jun 2023) [UQCV]

Catapults in SGD: spikes in the training loss and their impact on generalization through feature learning
Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, M. Belkin (07 Jun 2023)

Patch-level Routing in Mixture-of-Experts is Provably Sample-efficient for Convolutional Neural Networks
Mohammed Nowaz Rabbani Chowdhury, Shuai Zhang, Hao Wu, Sijia Liu, Pin-Yu Chen (07 Jun 2023) [MoE]

Continual Learning in Linear Classification on Separable Data
Itay Evron, E. Moroshko, G. Buzaglo, M. Khriesh, B. Marjieh, Nathan Srebro, Daniel Soudry (06 Jun 2023) [CLL]

Query Complexity of Active Learning for Function Family With Nearly Orthogonal Basis
Xiangyi Chen, Zhao-quan Song, Baochen Sun, Junze Yin, Danyang Zhuo (06 Jun 2023)

Aiming towards the minimizers: fast convergence of SGD for overparametrized problems
Chaoyue Liu, Dmitriy Drusvyatskiy, M. Belkin, Damek Davis, Yi Ma (05 Jun 2023) [ODL]

Towards Understanding Clean Generalization and Robust Overfitting in Adversarial Training
Binghui Li, Yuanzhi Li (02 Jun 2023) [AAML]

Initial Guessing Bias: How Untrained Networks Favor Some Classes
Emanuele Francazi, Aurelien Lucchi, Marco Baity-Jesi (01 Jun 2023) [AI4CE]
Understanding Augmentation-based Self-Supervised Representation Learning via RKHS Approximation and Regression
Runtian Zhai, Bing Liu, Andrej Risteski, Zico Kolter, Pradeep Ravikumar (01 Jun 2023) [SSL]

Benign Overfitting in Deep Neural Networks under Lazy Training
Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, Francesco Locatello, V. Cevher (30 May 2023) [AI4CE]

Generalization Ability of Wide Residual Networks
Jianfa Lai, Zixiong Yu, Songtao Tian, Qian Lin (29 May 2023)

Generalization Guarantees of Gradient Descent for Multi-Layer Neural Networks
Puyu Wang, Yunwen Lei, Di Wang, Yiming Ying, Ding-Xuan Zhou (26 May 2023) [MLT]

An Analytic End-to-End Deep Learning Algorithm based on Collaborative Learning
Sitan Li, C. Cheah (26 May 2023)

Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer
Yuandong Tian, Yiping Wang, Beidi Chen, S. Du (25 May 2023) [MLT]

Test like you Train in Implicit Deep Learning
Zaccharie Ramzi, Pierre Ablin, Gabriel Peyré, Thomas Moreau (24 May 2023)

On the Generalization of Diffusion Model
Mingyang Yi, Jiacheng Sun, Zhenguo Li (24 May 2023)

On progressive sharpening, flat minima and generalisation
L. MacDonald, Jack Valmadre, Simon Lucey (24 May 2023)

Fast Convergence in Learning Two-Layer Neural Networks with Separable Data
Hossein Taheri, Christos Thrampoulidis (22 May 2023) [MLT]
22 May 2023
Tight conditions for when the NTK approximation is valid
Enric Boix-Adserà
Etai Littwin
30
0
0
22 May 2023
Gradient Descent Monotonically Decreases the Sharpness of Gradient Flow Solutions in Scalar Networks and Beyond
Itai Kreisler
Mor Shpigel Nacson
Daniel Soudry
Y. Carmon
30
13
0
22 May 2023
Loss Spike in Training Neural Networks
Zhongwang Zhang
Z. Xu
33
4
0
20 May 2023
Mode Connectivity in Auction Design
Christoph Hertrich
Yixin Tao
László A. Végh
16
1
0
18 May 2023
ReLU soothes the NTK condition number and accelerates optimization for wide neural networks
Chaoyue Liu
Like Hui
MLT
27
9
0
15 May 2023
Efficient Asynchronize Stochastic Gradient Algorithm with Structured Data
Zhao-quan Song
Mingquan Ye
22
4
0
13 May 2023
Robust Implicit Regularization via Weight Normalization
H. Chou
Holger Rauhut
Rachel A. Ward
30
7
0
09 May 2023
Neural Exploitation and Exploration of Contextual Bandits
Yikun Ban
Yuchen Yan
A. Banerjee
Jingrui He
42
8
0
05 May 2023
On the Eigenvalue Decay Rates of a Class of Neural-Network Related Kernel Functions Defined on General Domains
Yicheng Li
Zixiong Yu
Y. Cotronis
Qian Lin
55
13
0
04 May 2023
MISNN: Multiple Imputation via Semi-parametric Neural Networks
Zhiqi Bu
Zongyu Dai
Yiliang Zhang
Q. Long
28
0
0
02 May 2023