ResearchTrend.AI

Gradient Descent Finds Global Minima of Deep Neural Networks
arXiv:1811.03804, v4 (latest)

9 November 2018
S. Du
Jason D. Lee
Haochuan Li
Liwei Wang
Masayoshi Tomizuka
    ODL

Papers citing "Gradient Descent Finds Global Minima of Deep Neural Networks"

50 / 466 papers shown
Joint Tensor-Train Parameterization for Efficient and Expressive Low-Rank Adaptation
Jun Qi
Chen-Yu Liu
Sabato Marco Siniscalchi
Chao-Han Huck Yang
Min-hsiu Hsieh
15
0
0
19 Jun 2025
Generalization Bound of Gradient Flow through Training Trajectory and Data-dependent Kernel
Yilan Chen
Zhichao Wang
Wei Huang
Andi Han
Taiji Suzuki
Arya Mazumdar
MLT
20
0
0
12 Jun 2025
Model Reprogramming Demystified: A Neural Tangent Kernel Perspective
Ming-Yu Chung
Jiashuo Fan
Hancheng Ye
Qinsi Wang
Wei-Chen Shen
Chia-Mu Yu
Pin-Yu Chen
Sy-Yen Kuo
24
1
0
31 May 2025
Querying Kernel Methods Suffices for Reconstructing their Training Data
Daniel Barzilai
Yuval Margalit
Eitan Gronich
Gilad Yehudai
Meirav Galun
Ronen Basri
35
0
0
25 May 2025
TULiP: Test-time Uncertainty Estimation via Linearization and Weight Perturbation
Yuhui Zhang
Dongshen Wu
Yuichiro Wada
Takafumi Kanamori
OODD
241
1
0
22 May 2025
Training NTK to Generalize with KARE
Johannes Schwab
Bryan Kelly
Semyon Malamud
Teng Andrea Xu
146
0
0
16 May 2025
Divergence of Empirical Neural Tangent Kernel in Classification Problems
Zixiong Yu
Songtao Tian
Guhan Chen
68
0
0
15 Apr 2025
On the Cone Effect in the Learning Dynamics
Zhanpeng Zhou
Yongyi Yang
Jie Ren
Mahito Sugiyama
Junchi Yan
116
0
0
20 Mar 2025
Escaping from the Barren Plateau via Gaussian Initializations in Deep Variational Quantum Circuits
Kaining Zhang
Liu Liu
Min-hsiu Hsieh
Dacheng Tao
141
63
0
20 Feb 2025
Stability and Generalization in Free Adversarial Training
Xiwei Cheng
Kexin Fu
Farzan Farnia
AAML
84
3
0
08 Jan 2025
Multi-Label Bayesian Active Learning with Inter-Label Relationships
Yuanyuan Qi
Jueqing Lu
Xiaohao Yang
Joanne Enticott
Lan Du
181
0
0
26 Nov 2024
Variance-Aware Linear UCB with Deep Representation for Neural Contextual Bandits
H. Bui
Enrique Mallada
Anqi Liu
509
1
0
08 Nov 2024
Sharper Guarantees for Learning Neural Network Classifiers with Gradient Methods
Hossein Taheri
Christos Thrampoulidis
Arya Mazumdar
MLT
119
0
0
13 Oct 2024
Adversarial Training Can Provably Improve Robustness: Theoretical Analysis of Feature Learning Process Under Structured Data
Binghui Li
Yuanzhi Li
OOD
94
4
0
11 Oct 2024
Extended convexity and smoothness and their applications in deep learning
Binchuan Qi
Wei Gong
Li Li
107
0
0
08 Oct 2024
Measuring and Controlling Solution Degeneracy across Task-Trained Recurrent Neural Networks
Ann Huang
Satpreet H. Singh
Flavio Martinelli
Kanaka Rajan
96
0
0
04 Oct 2024
Simplicity bias and optimization threshold in two-layer ReLU networks
Etienne Boursier
Nicolas Flammarion
93
4
0
03 Oct 2024
From Lazy to Rich: Exact Learning Dynamics in Deep Linear Networks
Clémentine Dominé
Nicolas Anguita
A. Proca
Lukas Braun
D. Kunin
P. Mediano
Andrew M. Saxe
131
6
0
22 Sep 2024
Monomial Matrix Group Equivariant Neural Functional Networks
Hoang V. Tran
Thieu N. Vo
Tho H. Tran
An T. Nguyen
Tan M. Nguyen
140
9
0
18 Sep 2024
On the Pinsker bound of inner product kernel regression in large dimensions
Weihao Lu
Jialin Ding
Haobo Zhang
Qian Lin
93
1
0
02 Sep 2024
MAP: Low-compute Model Merging with Amortized Pareto Fronts via Quadratic Approximation
Lu Li
Tianze Zhang
Zhiqi Bu
Suyuchen Wang
Huan He
Jie Fu
Yonghui Wu
Jiang Bian
Yong Chen
Yoshua Bengio
FedML, MoMe
119
6
0
11 Jun 2024
Loss Gradient Gaussian Width based Generalization and Optimization Guarantees
A. Banerjee
Qiaobo Li
Yingxue Zhou
158
0
0
11 Jun 2024
Reparameterization invariance in approximate Bayesian inference
Hrittik Roy
M. Miani
Carl Henrik Ek
Philipp Hennig
Marvin Pfortner
Lukas Tatzel
Søren Hauberg
BDL
119
9
0
05 Jun 2024
Approximation with Random Shallow ReLU Networks with Applications to Model Reference Adaptive Control
Andrew G. Lamperski
Tyler Lekang
53
3
0
25 Mar 2024
Non-asymptotic Analysis of Biased Adaptive Stochastic Approximation
Sobihan Surendran
Antoine Godichon-Baggioni
Adeline Fermanian
Sylvain Le Corff
109
2
0
05 Feb 2024
Critical Influence of Overparameterization on Sharpness-aware Minimization
Sungbin Shin
Dongyeop Lee
Maksym Andriushchenko
Namhoon Lee
AAML
160
2
0
29 Nov 2023
Distributed Constrained Combinatorial Optimization leveraging Hypergraph Neural Networks
Nasimeh Heydaribeni
Xinrui Zhan
Ruisi Zhang
Tina Eliassi-Rad
F. Koushanfar
AI4CE
101
11
0
15 Nov 2023
Initialization Matters: Privacy-Utility Analysis of Overparameterized Neural Networks
Jiayuan Ye
Zhenyu Zhu
Fanghui Liu
Reza Shokri
Volkan Cevher
87
13
0
31 Oct 2023
How Graph Neural Networks Learn: Lessons from Training Dynamics
Chenxiao Yang
Qitian Wu
David Wipf
Ruoyu Sun
Junchi Yan
AI4CE, GNN
62
1
0
08 Oct 2023
Multilayer Multiset Neuronal Networks -- MMNNs
Alexandre Benatti
L. D. F. Costa
55
1
0
28 Aug 2023
Six Lectures on Linearized Neural Networks
Theodor Misiakiewicz
Andrea Montanari
134
13
0
25 Aug 2023
Quantifying the Optimization and Generalization Advantages of Graph Neural Networks Over Multilayer Perceptrons
Wei Huang
Yuanbin Cao
Hong Wang
Xin Cao
Taiji Suzuki
MLT
86
8
0
24 Jun 2023
Gradient is All You Need?
Konstantin Riedl
T. Klock
Carina Geldhauser
M. Fornasier
57
8
0
16 Jun 2023
Query Complexity of Active Learning for Function Family With Nearly Orthogonal Basis
Xiangyi Chen
Zhao Song
Baochen Sun
Junze Yin
Danyang Zhuo
88
3
0
06 Jun 2023
Initial Guessing Bias: How Untrained Networks Favor Some Classes
Emanuele Francazi
Aurelien Lucchi
Marco Baity-Jesi
AI4CE
75
4
0
01 Jun 2023
Benign Overfitting in Deep Neural Networks under Lazy Training
Zhenyu Zhu
Fanghui Liu
Grigorios G. Chrysos
Francesco Locatello
Volkan Cevher
AI4CE
66
10
0
30 May 2023
An Analytic End-to-End Deep Learning Algorithm based on Collaborative Learning
Sitan Li
C. Cheah
47
1
0
26 May 2023
Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer
Yuandong Tian
Yiping Wang
Beidi Chen
S. Du
MLT
109
79
0
25 May 2023
Fast Convergence in Learning Two-Layer Neural Networks with Separable Data
Hossein Taheri
Christos Thrampoulidis
MLT
49
3
0
22 May 2023
Loss Spike in Training Neural Networks
Zhongwang Zhang
Z. Xu
72
7
0
20 May 2023
Mode Connectivity in Auction Design
Christoph Hertrich
Yixin Tao
László A. Végh
69
1
0
18 May 2023
Efficient Asynchronize Stochastic Gradient Algorithm with Structured Data
Zhao Song
Mingquan Ye
67
4
0
13 May 2023
Function Approximation with Randomly Initialized Neural Networks for Approximate Model Reference Adaptive Control
Tyler Lekang
Andrew G. Lamperski
50
0
0
28 Mar 2023
Phase Diagram of Initial Condensation for Two-layer Neural Networks
Zheng Chen
Yuqing Li
Yaoyu Zhang
Zhaoguang Zhou
Z. Xu
MLT, AI4CE
104
11
0
12 Mar 2023
Linear CNNs Discover the Statistical Structure of the Dataset Using Only the Most Dominant Frequencies
Hannah Pinson
Joeri Lenaerts
V. Ginis
53
3
0
03 Mar 2023
Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks
Ye Li
Songcan Chen
Shengyi Huang
PINN
51
3
0
03 Mar 2023
On the existence of minimizers in shallow residual ReLU neural network optimization landscapes
Steffen Dereich
Arnulf Jentzen
Sebastian Kassing
65
7
0
28 Feb 2023
Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron
Weihang Xu
S. Du
105
16
0
20 Feb 2023
Global Convergence Rate of Deep Equilibrium Models with General Activations
Lan V. Truong
127
2
0
11 Feb 2023
Rethinking Gauss-Newton for learning over-parameterized models
Michael Arbel
Romain Menegaux
Pierre Wolinski
AI4CE
82
6
0
06 Feb 2023