1811.03804
Gradient Descent Finds Global Minima of Deep Neural Networks
9 November 2018
S. Du
Jason D. Lee
Haochuan Li
Liwei Wang
Masayoshi Tomizuka
ODL
Papers citing "Gradient Descent Finds Global Minima of Deep Neural Networks"
50 / 466 papers shown
Title
Over-parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning
François Caron
Fadhel Ayed
Paul Jung
Hoileong Lee
Juho Lee
Hongseok Yang
129
2
0
02 Feb 2023
ZiCo: Zero-shot NAS via Inverse Coefficient of Variation on Gradients
Guihong Li
Yuedong Yang
Kartikeya Bhardwaj
R. Marculescu
117
63
0
26 Jan 2023
Convergence beyond the over-parameterized regime using Rayleigh quotients
David A. R. Robin
Kevin Scaman
Marc Lelarge
60
3
0
19 Jan 2023
Stretched and measured neural predictions of complex network dynamics
V. Vasiliauskaite
Nino Antulov-Fantulin
67
1
0
12 Jan 2023
An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models
Yufeng Zhang
Boyi Liu
Qi Cai
Lingxiao Wang
Zhaoran Wang
121
13
0
30 Dec 2022
Enhancing Neural Network Differential Equation Solvers
Matthew J. H. Wright
44
1
0
28 Dec 2022
Rank-1 Matrix Completion with Gradient Descent and Small Random Initialization
Daesung Kim
Hye Won Chung
86
2
0
19 Dec 2022
Reconstructing Training Data from Model Gradient, Provably
Zihan Wang
Jason D. Lee
Qi Lei
FedML
116
26
0
07 Dec 2022
Zeroth-Order Alternating Gradient Descent Ascent Algorithms for a Class of Nonconvex-Nonconcave Minimax Problems
Zi Xu
Ziqi Wang
Junlin Wang
Y. Dai
103
11
0
24 Nov 2022
Mechanistic Mode Connectivity
Ekdeep Singh Lubana
Eric J. Bigelow
Robert P. Dick
David M. Krueger
Hidenori Tanaka
118
49
0
15 Nov 2022
Spectral Evolution and Invariance in Linear-width Neural Networks
Zhichao Wang
A. Engel
Anand D. Sarwate
Ioana Dumitriu
Tony Chiang
116
18
0
11 Nov 2022
Finite Sample Identification of Wide Shallow Neural Networks with Biases
M. Fornasier
T. Klock
Marco Mondelli
Michael Rauchensteiner
52
6
0
08 Nov 2022
A Functional-Space Mean-Field Theory of Partially-Trained Three-Layer Neural Networks
Zhengdao Chen
Eric Vanden-Eijnden
Joan Bruna
MLT
75
5
0
28 Oct 2022
Efficient and Light-Weight Federated Learning via Asynchronous Distributed Dropout
Chen Dun
Mirian Hipolito Garcia
C. Jermaine
Dimitrios Dimitriadis
Anastasios Kyrillidis
136
22
0
28 Oct 2022
Bures-Wasserstein Barycenters and Low-Rank Matrix Recovery
Tyler Maunu
Thibaut Le Gouic
Philippe Rigollet
64
5
0
26 Oct 2022
GCT: Gated Contextual Transformer for Sequential Audio Tagging
Yuanbo Hou
Yun Wang
Wenwu Wang
Dick Botteldooren
60
0
0
22 Oct 2022
When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work
Jiawei Zhang
Yushun Zhang
Mingyi Hong
Ruoyu Sun
Zhi-Quan Luo
124
10
0
21 Oct 2022
Few-shot Backdoor Attacks via Neural Tangent Kernels
J. Hayase
Sewoong Oh
72
21
0
12 Oct 2022
A Kernel-Based View of Language Model Fine-Tuning
Sadhika Malladi
Alexander Wettig
Dingli Yu
Danqi Chen
Sanjeev Arora
VLM
157
69
0
11 Oct 2022
What Can the Neural Tangent Kernel Tell Us About Adversarial Robustness?
Nikolaos Tsilivis
Julia Kempe
AAML
98
20
0
11 Oct 2022
On skip connections and normalisation layers in deep optimisation
L. MacDonald
Jack Valmadre
Hemanth Saratchandran
Simon Lucey
ODL
74
2
0
10 Oct 2022
Nonlinear Sufficient Dimension Reduction with a Stochastic Neural Network
Siqi Liang
Y. Sun
F. Liang
BDL
71
11
0
09 Oct 2022
Adaptive Smoothness-weighted Adversarial Training for Multiple Perturbations with Its Stability Analysis
Jiancong Xiao
Zeyu Qin
Yanbo Fan
Baoyuan Wu
Jue Wang
Zhimin Luo
AAML
124
7
0
02 Oct 2022
Improved Algorithms for Neural Active Learning
Yikun Ban
Yuheng Zhang
Hanghang Tong
A. Banerjee
Jingrui He
AI4TS
61
12
0
02 Oct 2022
Restricted Strong Convexity of Deep Learning Models with Smooth Activations
A. Banerjee
Pedro Cisneros-Velarde
Libin Zhu
M. Belkin
73
8
0
29 Sep 2022
Magnitude and Angle Dynamics in Training Single ReLU Neurons
Sangmin Lee
Byeongsu Sim
Jong Chul Ye
MLT
137
6
0
27 Sep 2022
Approximation results for Gradient Descent trained Shallow Neural Networks in 1d
R. Gentile
G. Welper
ODL
102
7
0
17 Sep 2022
Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization)
Zhenyu Zhu
Fanghui Liu
Grigorios G. Chrysos
Volkan Cevher
104
21
0
15 Sep 2022
Generalization Properties of NAS under Activation and Skip Connection Search
Zhenyu Zhu
Fanghui Liu
Grigorios G. Chrysos
Volkan Cevher
AI4CE
90
17
0
15 Sep 2022
Visualizing high-dimensional loss landscapes with Hessian directions
Lucas Böttcher
Gregory R. Wheeler
79
14
0
28 Aug 2022
A Sublinear Adversarial Training Algorithm
Yeqi Gao
Lianke Qin
Zhao Song
Yitan Wang
GAN
77
25
0
10 Aug 2022
Provable Acceleration of Nesterov's Accelerated Gradient Method over Heavy Ball Method in Training Over-Parameterized Neural Networks
Xin Liu
Wei Tao
Wei Li
Dazhi Zhan
Jun Wang
Zhisong Pan
ODL
76
1
0
08 Aug 2022
Feature selection with gradient descent on two-layer networks in low-rotation regimes
Matus Telgarsky
MLT
81
16
0
04 Aug 2022
Gradient descent provably escapes saddle points in the training of shallow ReLU networks
Patrick Cheridito
Arnulf Jentzen
Florian Rossmannek
103
5
0
03 Aug 2022
Analyzing Sharpness along GD Trajectory: Progressive Sharpening and Edge of Stability
Z. Li
Zixuan Wang
Jian Li
97
47
0
26 Jul 2022
Can we achieve robustness from data alone?
Nikolaos Tsilivis
Jingtong Su
Julia Kempe
OOD
DD
108
18
0
24 Jul 2022
Deep Sequence Models for Text Classification Tasks
S. S. Abdullahi
Su Yiming
Shamsuddeen Hassan Muhammad
A. Mustapha
Ahmad Muhammad Aminu
Abdulkadir Abdullahi
Musa Bello
Saminu Mohammad Aliyu
53
3
0
18 Jul 2022
Efficient Augmentation for Imbalanced Deep Learning
Damien Dablain
C. Bellinger
Bartosz Krawczyk
Nitesh Chawla
66
7
0
13 Jul 2022
Synergy and Symmetry in Deep Learning: Interactions between the Data, Model, and Inference Algorithm
Lechao Xiao
Jeffrey Pennington
101
10
0
11 Jul 2022
Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent
Zhiyuan Li
Tianhao Wang
Jason D. Lee
Sanjeev Arora
104
29
0
08 Jul 2022
Informed Learning by Wide Neural Networks: Convergence, Generalization and Sampling Complexity
Jianyi Yang
Shaolei Ren
78
3
0
02 Jul 2022
Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis
Alexander Munteanu
Simon Omlor
Zhao Song
David P. Woodruff
97
15
0
26 Jun 2022
Limitations of the NTK for Understanding Generalization in Deep Learning
Nikhil Vyas
Yamini Bansal
Preetum Nakkiran
116
34
0
20 Jun 2022
On the fast convergence of minibatch heavy ball momentum
Raghu Bollapragada
Tyler Chen
Rachel A. Ward
110
19
0
15 Jun 2022
From Perception to Programs: Regularize, Overparameterize, and Amortize
Hao Tang
Kevin Ellis
NAI
82
10
0
13 Jun 2022
On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms
Lam M. Nguyen
Trang H. Tran
63
2
0
13 Jun 2022
What is a Good Metric to Study Generalization of Minimax Learners?
Asuman Ozdaglar
S. Pattathil
Jiawei Zhang
Kai Zhang
66
14
0
09 Jun 2022
Adversarial Noises Are Linearly Separable for (Nearly) Random Neural Networks
Huishuai Zhang
Da Yu
Yiping Lu
Di He
AAML
98
1
0
09 Jun 2022
Spectral Bias Outside the Training Set for Deep Networks in the Kernel Regime
Benjamin Bowman
Guido Montúfar
82
15
0
06 Jun 2022
The Neural Covariance SDE: Shaped Infinite Depth-and-Width Networks at Initialization
Mufan Li
Mihai Nica
Daniel M. Roy
104
39
0
06 Jun 2022