arXiv: 1811.03804

Gradient Descent Finds Global Minima of Deep Neural Networks
9 November 2018
S. Du, J. Lee, Haochuan Li, Liwei Wang, M. Tomizuka
ODL

Papers citing "Gradient Descent Finds Global Minima of Deep Neural Networks"
50 / 763 papers shown
A Kernel-Based View of Language Model Fine-Tuning
Sadhika Malladi, Alexander Wettig, Dingli Yu, Danqi Chen, Sanjeev Arora · VLM · 68 / 60 / 0 · 11 Oct 2022

What Can the Neural Tangent Kernel Tell Us About Adversarial Robustness?
Nikolaos Tsilivis, Julia Kempe · AAML · 44 / 17 / 0 · 11 Oct 2022

On skip connections and normalisation layers in deep optimisation
L. MacDonald, Jack Valmadre, Hemanth Saratchandran, Simon Lucey · ODL · 19 / 1 / 0 · 10 Oct 2022

Nonlinear Sufficient Dimension Reduction with a Stochastic Neural Network
Siqi Liang, Y. Sun, F. Liang · BDL · 34 / 8 / 0 · 09 Oct 2022

Stability Analysis and Generalization Bounds of Adversarial Training
Jiancong Xiao, Yanbo Fan, Ruoyu Sun, Jue Wang, Zhimin Luo · AAML · 32 / 30 / 0 · 03 Oct 2022

Adaptive Smoothness-weighted Adversarial Training for Multiple Perturbations with Its Stability Analysis
Jiancong Xiao, Zeyu Qin, Yanbo Fan, Baoyuan Wu, Jue Wang, Zhimin Luo · AAML · 31 / 7 / 0 · 02 Oct 2022

Improved Algorithms for Neural Active Learning
Yikun Ban, Yuheng Zhang, Hanghang Tong, A. Banerjee, Jingrui He · AI4TS · 36 / 11 / 0 · 02 Oct 2022

Restricted Strong Convexity of Deep Learning Models with Smooth Activations
A. Banerjee, Pedro Cisneros-Velarde, Libin Zhu, M. Belkin · 26 / 7 / 0 · 29 Sep 2022

Magnitude and Angle Dynamics in Training Single ReLU Neurons
Sangmin Lee, Byeongsu Sim, Jong Chul Ye · MLT · 96 / 6 / 0 · 27 Sep 2022
Lazy vs hasty: linearization in deep networks impacts learning schedule based on example difficulty
Thomas George, Guillaume Lajoie, A. Baratin · 28 / 5 / 0 · 19 Sep 2022

Approximation results for Gradient Descent trained Shallow Neural Networks in 1d
R. Gentile, G. Welper · ODL · 52 / 6 / 0 · 17 Sep 2022

Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a Polynomial Net Study
Yongtao Wu, Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, V. Cevher · 28 / 10 / 0 · 16 Sep 2022

Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization)
Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, V. Cevher · 39 / 19 / 0 · 15 Sep 2022

Generalization Properties of NAS under Activation and Skip Connection Search
Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, V. Cevher · AI4CE · 28 / 15 / 0 · 15 Sep 2022

Robust Constrained Reinforcement Learning
Yue Wang, Fei Miao, Shaofeng Zou · 37 / 12 / 0 · 14 Sep 2022

Neural Tangent Kernel: A Survey
Eugene Golikov, Eduard Pokonechnyy, Vladimir Korviakov · 27 / 12 / 0 · 29 Aug 2022

Visualizing high-dimensional loss landscapes with Hessian directions
Lucas Böttcher, Gregory R. Wheeler · 37 / 12 / 0 · 28 Aug 2022

A Sublinear Adversarial Training Algorithm
Yeqi Gao, Lianke Qin, Zhao-quan Song, Yitan Wang · GAN · 31 / 25 / 0 · 10 Aug 2022
Provable Acceleration of Nesterov's Accelerated Gradient Method over Heavy Ball Method in Training Over-Parameterized Neural Networks
Xin Liu, Wei Tao, Wei Li, Dazhi Zhan, Jun Wang, Zhisong Pan · ODL · 30 / 1 / 0 · 08 Aug 2022

Federated Adversarial Learning: A Framework with Convergence Analysis
Xiaoxiao Li, Zhao-quan Song, Jiaming Yang · FedML · 27 / 19 / 0 · 07 Aug 2022

Feature selection with gradient descent on two-layer networks in low-rotation regimes
Matus Telgarsky · MLT · 28 / 16 / 0 · 04 Aug 2022

Gradient descent provably escapes saddle points in the training of shallow ReLU networks
Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek · 28 / 5 / 0 · 03 Aug 2022

Analyzing Sharpness along GD Trajectory: Progressive Sharpening and Edge of Stability
Z. Li, Zixuan Wang, Jian Li · 19 / 42 / 0 · 26 Jul 2022

Can we achieve robustness from data alone?
Nikolaos Tsilivis, Jingtong Su, Julia Kempe · OOD, DD · 36 / 18 / 0 · 24 Jul 2022

Deep Sequence Models for Text Classification Tasks
S. S. Abdullahi, Su Yiming, Shamsuddeen Hassan Muhammad, A. Mustapha, Ahmad Muhammad Aminu, Abdulkadir Abdullahi, Musa Bello, Saminu Mohammad Aliyu · 22 / 3 / 0 · 18 Jul 2022

Efficient Augmentation for Imbalanced Deep Learning
Damien Dablain, C. Bellinger, Bartosz Krawczyk, Nitesh V. Chawla · 30 / 7 / 0 · 13 Jul 2022

Synergy and Symmetry in Deep Learning: Interactions between the Data, Model, and Inference Algorithm
Lechao Xiao, Jeffrey Pennington · 34 / 10 / 0 · 11 Jul 2022

Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent
Zhiyuan Li, Tianhao Wang, Jason D. Lee, Sanjeev Arora · 37 / 27 / 0 · 08 Jul 2022

Informed Learning by Wide Neural Networks: Convergence, Generalization and Sampling Complexity
Jianyi Yang, Shaolei Ren · 32 / 3 / 0 · 02 Jul 2022
Neural Networks can Learn Representations with Gradient Descent
Alexandru Damian, Jason D. Lee, Mahdi Soltanolkotabi · SSL, MLT · 19 / 114 / 0 · 30 Jun 2022

A note on Linear Bottleneck networks and their Transition to Multilinearity
Libin Zhu, Parthe Pandit, M. Belkin · MLT · 70 / 0 / 0 · 30 Jun 2022

Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis
Alexander Munteanu, Simon Omlor, Zhao-quan Song, David P. Woodruff · 30 / 15 / 0 · 26 Jun 2022

AdAUC: End-to-end Adversarial AUC Optimization Against Long-tail Problems
Wen-ming Hou, Qianqian Xu, Zhiyong Yang, Shilong Bao, Yuan He, Qingming Huang · AAML · 26 / 5 / 0 · 24 Jun 2022

Limitations of the NTK for Understanding Generalization in Deep Learning
Nikhil Vyas, Yamini Bansal, Preetum Nakkiran · 30 / 32 / 0 · 20 Jun 2022

Wide Bayesian neural networks have a simple weight posterior: theory and accelerated sampling
Jiri Hron, Roman Novak, Jeffrey Pennington, Jascha Narain Sohl-Dickstein · UQCV, BDL · 48 / 6 / 0 · 15 Jun 2022

On the fast convergence of minibatch heavy ball momentum
Raghu Bollapragada, Tyler Chen, Rachel A. Ward · 26 / 17 / 0 · 15 Jun 2022

From Perception to Programs: Regularize, Overparameterize, and Amortize
Hao Tang, Kevin Ellis · NAI · 22 / 10 / 0 · 13 Jun 2022

On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms
Lam M. Nguyen, Trang H. Tran · 32 / 2 / 0 · 13 Jun 2022

Parameter Convex Neural Networks
Jingcheng Zhou, Wei Wei, Xing Li, Bowen Pang, Zhiming Zheng · 6 / 0 / 0 · 11 Jun 2022
What is a Good Metric to Study Generalization of Minimax Learners?
Asuman Ozdaglar, S. Pattathil, Jiawei Zhang, Kaipeng Zhang · 21 / 13 / 0 · 09 Jun 2022

Adversarial Noises Are Linearly Separable for (Nearly) Random Neural Networks
Huishuai Zhang, Da Yu, Yiping Lu, Di He · AAML · 27 / 1 / 0 · 09 Jun 2022

Wavelet Regularization Benefits Adversarial Training
Jun Yan, Huilin Yin, Xiaoyang Deng, Zi-qin Zhao, Wancheng Ge, Hao Zhang, Gerhard Rigoll · AAML · 19 / 2 / 0 · 08 Jun 2022

Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials
Eshaan Nichani, Yunzhi Bai, Jason D. Lee · 27 / 10 / 0 · 08 Jun 2022

Neural Bandit with Arm Group Graph
Yunzhe Qi, Yikun Ban, Jingrui He · 25 / 10 / 0 · 08 Jun 2022

Spectral Bias Outside the Training Set for Deep Networks in the Kernel Regime
Benjamin Bowman, Guido Montúfar · 22 / 14 / 0 · 06 Jun 2022

The Neural Covariance SDE: Shaped Infinite Depth-and-Width Networks at Initialization
Mufan Bill Li, Mihai Nica, Daniel M. Roy · 35 / 36 / 0 · 06 Jun 2022

Provably and Practically Efficient Neural Contextual Bandits
Sudeep Salgia, Sattar Vakili, Qing Zhao · 22 / 8 / 0 · 31 May 2022

Provable General Function Class Representation Learning in Multitask Bandits and MDPs
Rui Lu, Andrew Zhao, S. Du, Gao Huang · OffRL · 35 / 10 / 0 · 31 May 2022

Non-convex online learning via algorithmic equivalence
Udaya Ghai, Zhou Lu, Elad Hazan · 14 / 8 / 0 · 30 May 2022

Excess Risk of Two-Layer ReLU Neural Networks in Teacher-Student Settings and its Superiority to Kernel Methods
Shunta Akiyama, Taiji Suzuki · 30 / 6 / 0 · 30 May 2022