ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Gradient Descent Finds Global Minima of Deep Neural Networks
arXiv: 1811.03804 · Cited By

9 November 2018
S. Du
Jason D. Lee
Haochuan Li
Liwei Wang
Masayoshi Tomizuka
    ODL

Papers citing "Gradient Descent Finds Global Minima of Deep Neural Networks"

50 / 466 papers shown
Over-parameterised Shallow Neural Networks with Asymmetrical Node Scaling: Global Convergence Guarantees and Feature Learning
François Caron
Fadhel Ayed
Paul Jung
Hoileong Lee
Juho Lee
Hongseok Yang
129
2
0
02 Feb 2023
ZiCo: Zero-shot NAS via Inverse Coefficient of Variation on Gradients
Guihong Li
Yuedong Yang
Kartikeya Bhardwaj
R. Marculescu
117
63
0
26 Jan 2023
Convergence beyond the over-parameterized regime using Rayleigh quotients
David A. R. Robin
Kevin Scaman
Marc Lelarge
60
3
0
19 Jan 2023
Stretched and measured neural predictions of complex network dynamics
V. Vasiliauskaite
Nino Antulov-Fantulin
67
1
0
12 Jan 2023
An Analysis of Attention via the Lens of Exchangeability and Latent Variable Models
Yufeng Zhang
Boyi Liu
Qi Cai
Lingxiao Wang
Zhaoran Wang
121
13
0
30 Dec 2022
Enhancing Neural Network Differential Equation Solvers
Matthew J. H. Wright
44
1
0
28 Dec 2022
Rank-1 Matrix Completion with Gradient Descent and Small Random Initialization
Daesung Kim
Hye Won Chung
86
2
0
19 Dec 2022
Reconstructing Training Data from Model Gradient, Provably
Zihan Wang
Jason D. Lee
Qi Lei
FedML
116
26
0
07 Dec 2022
Zeroth-Order Alternating Gradient Descent Ascent Algorithms for a Class of Nonconvex-Nonconcave Minimax Problems
Zi Xu
Ziqi Wang
Junlin Wang
Y. Dai
103
11
0
24 Nov 2022
Mechanistic Mode Connectivity
Ekdeep Singh Lubana
Eric J. Bigelow
Robert P. Dick
David M. Krueger
Hidenori Tanaka
118
49
0
15 Nov 2022
Spectral Evolution and Invariance in Linear-width Neural Networks
Zhichao Wang
A. Engel
Anand D. Sarwate
Ioana Dumitriu
Tony Chiang
116
18
0
11 Nov 2022
Finite Sample Identification of Wide Shallow Neural Networks with Biases
M. Fornasier
T. Klock
Marco Mondelli
Michael Rauchensteiner
52
6
0
08 Nov 2022
A Functional-Space Mean-Field Theory of Partially-Trained Three-Layer Neural Networks
Zhengdao Chen
Eric Vanden-Eijnden
Joan Bruna
MLT
75
5
0
28 Oct 2022
Efficient and Light-Weight Federated Learning via Asynchronous Distributed Dropout
Chen Dun
Mirian Hipolito Garcia
C. Jermaine
Dimitrios Dimitriadis
Anastasios Kyrillidis
136
22
0
28 Oct 2022
Bures-Wasserstein Barycenters and Low-Rank Matrix Recovery
Tyler Maunu
Thibaut Le Gouic
Philippe Rigollet
64
5
0
26 Oct 2022
GCT: Gated Contextual Transformer for Sequential Audio Tagging
Yuanbo Hou
Yun Wang
Wenwu Wang
Dick Botteldooren
60
0
0
22 Oct 2022
When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work
Jiawei Zhang
Yushun Zhang
Mingyi Hong
Ruoyu Sun
Zhi-Quan Luo
124
10
0
21 Oct 2022
Few-shot Backdoor Attacks via Neural Tangent Kernels
J. Hayase
Sewoong Oh
72
21
0
12 Oct 2022
A Kernel-Based View of Language Model Fine-Tuning
Sadhika Malladi
Alexander Wettig
Dingli Yu
Danqi Chen
Sanjeev Arora
VLM
157
69
0
11 Oct 2022
What Can the Neural Tangent Kernel Tell Us About Adversarial Robustness?
Nikolaos Tsilivis
Julia Kempe
AAML
98
20
0
11 Oct 2022
On skip connections and normalisation layers in deep optimisation
L. MacDonald
Jack Valmadre
Hemanth Saratchandran
Simon Lucey
ODL
74
2
0
10 Oct 2022
Nonlinear Sufficient Dimension Reduction with a Stochastic Neural Network
Siqi Liang
Y. Sun
F. Liang
BDL
71
11
0
09 Oct 2022
Adaptive Smoothness-weighted Adversarial Training for Multiple Perturbations with Its Stability Analysis
Jiancong Xiao
Zeyu Qin
Yanbo Fan
Baoyuan Wu
Jue Wang
Zhimin Luo
AAML
124
7
0
02 Oct 2022
Improved Algorithms for Neural Active Learning
Yikun Ban
Yuheng Zhang
Hanghang Tong
A. Banerjee
Jingrui He
AI4TS
61
12
0
02 Oct 2022
Restricted Strong Convexity of Deep Learning Models with Smooth Activations
A. Banerjee
Pedro Cisneros-Velarde
Libin Zhu
M. Belkin
73
8
0
29 Sep 2022
Magnitude and Angle Dynamics in Training Single ReLU Neurons
Sangmin Lee
Byeongsu Sim
Jong Chul Ye
MLT
137
6
0
27 Sep 2022
Approximation results for Gradient Descent trained Shallow Neural Networks in $1d$
R. Gentile
G. Welper
ODL
102
7
0
17 Sep 2022
Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization)
Zhenyu Zhu
Fanghui Liu
Grigorios G. Chrysos
Volkan Cevher
104
21
0
15 Sep 2022
Generalization Properties of NAS under Activation and Skip Connection Search
Zhenyu Zhu
Fanghui Liu
Grigorios G. Chrysos
Volkan Cevher
AI4CE
90
17
0
15 Sep 2022
Visualizing high-dimensional loss landscapes with Hessian directions
Lucas Böttcher
Gregory R. Wheeler
79
14
0
28 Aug 2022
A Sublinear Adversarial Training Algorithm
Yeqi Gao
Lianke Qin
Zhao Song
Yitan Wang
GAN
77
25
0
10 Aug 2022
Provable Acceleration of Nesterov's Accelerated Gradient Method over Heavy Ball Method in Training Over-Parameterized Neural Networks
Xin Liu
Wei Tao
Wei Li
Dazhi Zhan
Jun Wang
Zhisong Pan
ODL
76
1
0
08 Aug 2022
Feature selection with gradient descent on two-layer networks in low-rotation regimes
Matus Telgarsky
MLT
81
16
0
04 Aug 2022
Gradient descent provably escapes saddle points in the training of shallow ReLU networks
Patrick Cheridito
Arnulf Jentzen
Florian Rossmannek
103
5
0
03 Aug 2022
Analyzing Sharpness along GD Trajectory: Progressive Sharpening and Edge of Stability
Z. Li
Zixuan Wang
Jian Li
97
47
0
26 Jul 2022
Can we achieve robustness from data alone?
Nikolaos Tsilivis
Jingtong Su
Julia Kempe
OODD
108
18
0
24 Jul 2022
Deep Sequence Models for Text Classification Tasks
S. S. Abdullahi
Su Yiming
Shamsuddeen Hassan Muhammad
A. Mustapha
Ahmad Muhammad Aminu
Abdulkadir Abdullahi
Musa Bello
Saminu Mohammad Aliyu
53
3
0
18 Jul 2022
Efficient Augmentation for Imbalanced Deep Learning
Damien Dablain
C. Bellinger
Bartosz Krawczyk
Nitesh Chawla
66
7
0
13 Jul 2022
Synergy and Symmetry in Deep Learning: Interactions between the Data, Model, and Inference Algorithm
Lechao Xiao
Jeffrey Pennington
101
10
0
11 Jul 2022
Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent
Zhiyuan Li
Tianhao Wang
Jason D. Lee
Sanjeev Arora
104
29
0
08 Jul 2022
Informed Learning by Wide Neural Networks: Convergence, Generalization and Sampling Complexity
Jianyi Yang
Shaolei Ren
78
3
0
02 Jul 2022
Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis
Alexander Munteanu
Simon Omlor
Zhao Song
David P. Woodruff
97
15
0
26 Jun 2022
Limitations of the NTK for Understanding Generalization in Deep Learning
Nikhil Vyas
Yamini Bansal
Preetum Nakkiran
116
34
0
20 Jun 2022
On the fast convergence of minibatch heavy ball momentum
Raghu Bollapragada
Tyler Chen
Rachel A. Ward
110
19
0
15 Jun 2022
From Perception to Programs: Regularize, Overparameterize, and Amortize
Hao Tang
Kevin Ellis
NAI
82
10
0
13 Jun 2022
On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms
Lam M. Nguyen
Trang H. Tran
63
2
0
13 Jun 2022
What is a Good Metric to Study Generalization of Minimax Learners?
Asuman Ozdaglar
S. Pattathil
Jiawei Zhang
Kai Zhang
66
14
0
09 Jun 2022
Adversarial Noises Are Linearly Separable for (Nearly) Random Neural Networks
Huishuai Zhang
Da Yu
Yiping Lu
Di He
AAML
98
1
0
09 Jun 2022
Spectral Bias Outside the Training Set for Deep Networks in the Kernel Regime
Benjamin Bowman
Guido Montúfar
82
15
0
06 Jun 2022
The Neural Covariance SDE: Shaped Infinite Depth-and-Width Networks at Initialization
Mufan Li
Mihai Nica
Daniel M. Roy
104
39
0
06 Jun 2022