Gradient Descent Provably Optimizes Over-parameterized Neural Networks

4 October 2018
S. Du
Xiyu Zhai
Barnabás Póczós
Aarti Singh
    MLT, ODL

Papers citing "Gradient Descent Provably Optimizes Over-parameterized Neural Networks"

50 / 882 papers shown
How DNNs break the Curse of Dimensionality: Compositionality and Symmetry Learning
Arthur Jacot
Seok Hoan Choi
Yuxiao Wen
AI4CE
143
2
0
08 Jul 2024
Evaluating the design space of diffusion-based generative models
Yuqing Wang
Ye He
Molei Tao
DiffM
101
6
0
18 Jun 2024
Precise analysis of ridge interpolators under heavy correlations -- a Random Duality Theory view
Mihailo Stojnic
54
1
0
13 Jun 2024
Ridge interpolators in correlated factor regression models -- exact risk analysis
Mihailo Stojnic
62
1
0
13 Jun 2024
Large Stepsize Gradient Descent for Non-Homogeneous Two-Layer Networks: Margin Improvement and Fast Optimization
Yuhang Cai
Jingfeng Wu
Song Mei
Michael Lindsey
Peter L. Bartlett
91
4
0
12 Jun 2024
Loss Gradient Gaussian Width based Generalization and Optimization Guarantees
A. Banerjee
Qiaobo Li
Yingxue Zhou
162
0
0
11 Jun 2024
Stable Minima Cannot Overfit in Univariate ReLU Networks: Generalization by Large Step Sizes
Dan Qiao
Kaiqi Zhang
Esha Singh
Daniel Soudry
Yu-Xiang Wang
NoLa
89
4
0
10 Jun 2024
Get rich quick: exact solutions reveal how unbalanced initializations promote rapid feature learning
D. Kunin
Allan Raventós
Clémentine Dominé
Feng Chen
David Klindt
Andrew M. Saxe
Surya Ganguli
MLT
129
18
0
10 Jun 2024
Error Bounds of Supervised Classification from Information-Theoretic Perspective
Binchuan Qi
Wei Gong
Li Li
60
0
0
07 Jun 2024
Cyclic Sparse Training: Is it Enough?
Advait Gadhikar
Sree Harsha Nelaturu
R. Burkholz
CLL
101
0
0
04 Jun 2024
Improving Generalization and Convergence by Enhancing Implicit Regularization
Mingze Wang
Haotian He
Jinbo Wang
Zilin Wang
Guanhua Huang
Feiyu Xiong
Zhiyu Li
E. Weinan
Lei Wu
96
8
0
31 May 2024
Recurrent Natural Policy Gradient for POMDPs
Semih Cayci
A. Eryilmaz
91
1
0
28 May 2024
Mixed Dynamics In Linear Networks: Unifying the Lazy and Active Regimes
Zhenfeng Tu
Santiago Aranguri
Arthur Jacot
67
11
0
27 May 2024
SF-DQN: Provable Knowledge Transfer using Successor Feature for Deep Reinforcement Learning
Shuai Zhang
Heshan Devaka Fernando
Miao Liu
K. Murugesan
Songtao Lu
Pin-Yu Chen
Tianyi Chen
Meng Wang
75
2
0
24 May 2024
Novel Kernel Models and Exact Representor Theory for Neural Networks Beyond the Over-Parameterized Regime
A. Shilton
Sunil R. Gupta
Santu Rana
Svetha Venkatesh
62
11
0
24 May 2024
Bounds for the smallest eigenvalue of the NTK for arbitrary spherical data of arbitrary dimension
Kedar Karhadkar
Michael Murray
Guido Montúfar
105
3
0
23 May 2024
Approximation and Gradient Descent Training with Neural Networks
G. Welper
64
1
0
19 May 2024
Error Analysis of Three-Layer Neural Network Trained with PGD for Deep Ritz Method
Yuling Jiao
Yanming Lai
Yang Wang
AI4CE
45
1
0
19 May 2024
Train Faster, Perform Better: Modular Adaptive Training in Over-Parameterized Models
Yubin Shi
Yixuan Chen
Mingzhi Dong
Xiaochen Yang
Dongsheng Li
...
Yingying Zhao
Fan Yang
Tun Lu
Ning Gu
L. Shang
MoMe
81
4
0
13 May 2024
An Improved Finite-time Analysis of Temporal Difference Learning with Deep Neural Networks
Zhifa Ke
Zaiwen Wen
Junyu Zhang
88
0
0
07 May 2024
Differentially Private Federated Learning without Noise Addition: When is it Possible?
Jiang Zhang
Konstantinos Psounis
FedML
107
0
0
06 May 2024
Graph is all you need? Lightweight data-agnostic neural architecture search without training
Zhenhan Huang
Tejaswini Pedapati
Pin-Yu Chen
Chunheng Jiang
Jianxi Gao
GNN
88
1
0
02 May 2024
Neural Dynamic Data Valuation
Zhangyong Liang
Huanhuan Gao
Ji Zhang
TDI
88
1
0
30 Apr 2024
On the Rashomon ratio of infinite hypothesis sets
Evzenie Coupkova
Mireille Boutin
70
1
0
27 Apr 2024
Regularized Gauss-Newton for Optimizing Overparameterized Neural Networks
Adeyemi Damilare Adeoye
Philipp Christian Petersen
Alberto Bemporad
65
1
0
23 Apr 2024
The Positivity of the Neural Tangent Kernel
Luís Carvalho
Joao L. Costa
José Mourao
Gonçalo Oliveira
86
3
0
19 Apr 2024
The phase diagram of kernel interpolation in large dimensions
Haobo Zhang
Weihao Lu
Qian Lin
76
6
0
19 Apr 2024
Learning epidemic trajectories through Kernel Operator Learning: from modelling to optimal control
Giovanni Ziarelli
N. Parolini
M. Verani
106
2
0
17 Apr 2024
Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks
Matteo Tucat
Anirbit Mukherjee
Procheta Sen
Mingfei Sun
Omar Rivasplata
MLT
91
1
0
12 Apr 2024
Understanding the Learning Dynamics of Alignment with Human Feedback
Shawn Im
Yixuan Li
ALM
107
14
0
27 Mar 2024
Robust NAS under adversarial training: benchmark, theory, and beyond
Yongtao Wu
Fanghui Liu
Carl-Johann Simon-Gabriel
Grigorios G. Chrysos
Volkan Cevher
AAML, OOD
92
5
0
19 Mar 2024
NTK-Guided Few-Shot Class Incremental Learning
Jingren Liu
Zhong Ji
Yanwei Pang
YunLong Yu
CLL
95
4
0
19 Mar 2024
Generalization of Scaled Deep ResNets in the Mean-Field Regime
Yihang Chen
Fanghui Liu
Yiping Lu
Grigorios G. Chrysos
Volkan Cevher
71
2
0
14 Mar 2024
Recovery Guarantees of Unsupervised Neural Networks for Inverse Problems trained with Gradient Descent
Nathan Buskulic
M. Fadili
Yvain Quéau
85
1
0
08 Mar 2024
KIWI: A Dataset of Knowledge-Intensive Writing Instructions for Answering Research Questions
Fangyuan Xu
Kyle Lo
Luca Soldaini
Bailey Kuehl
Eunsol Choi
David Wadden
84
9
0
06 Mar 2024
The Implicit Bias of Heterogeneity towards Invariance: A Study of Multi-Environment Matrix Sensing
Yang Xu
Yihong Gu
Cong Fang
93
0
0
03 Mar 2024
Merging Text Transformer Models from Different Initializations
Neha Verma
Maha Elbayad
MoMe
119
8
0
01 Mar 2024
Masks, Signs, And Learning Rate Rewinding
Advait Gadhikar
R. Burkholz
100
10
0
29 Feb 2024
Uncertainty Quantification of Graph Convolution Neural Network Models of Evolving Processes
J. Hauth
Cosmin Safta
Xun Huan
Ravi G. Patel
Reese E. Jones
BDL, UQCV
82
2
0
17 Feb 2024
Fixed width treelike neural networks capacity analysis -- generic activations
M. Stojnic
64
3
0
08 Feb 2024
Non-convergence to global minimizers for Adam and stochastic gradient descent optimization and constructions of local minimizers in the training of artificial neural networks
Arnulf Jentzen
Adrian Riekert
71
4
0
07 Feb 2024
Analyzing the Neural Tangent Kernel of Periodically Activated Coordinate Networks
Hemanth Saratchandran
Shin-Fang Chng
Simon Lucey
100
2
0
07 Feb 2024
Deconstructing the Goldilocks Zone of Neural Network Initialization
Artem Vysogorets
Anna Dawid
Julia Kempe
73
1
0
05 Feb 2024
Data-induced multiscale losses and efficient multirate gradient descent schemes
Juncai He
Liangchen Liu
Yen-Hsi Tsai
90
0
0
05 Feb 2024
Architectural Strategies for the optimization of Physics-Informed Neural Networks
Hemanth Saratchandran
Shin-Fang Chng
Simon Lucey
AI4CE
73
0
0
05 Feb 2024
No Free Prune: Information-Theoretic Barriers to Pruning at Initialization
Tanishq Kumar
Kevin Luo
Mark Sellke
78
3
0
02 Feb 2024
Algebraic Complexity and Neurovariety of Linear Convolutional Networks
Vahid Shahverdi
112
4
0
29 Jan 2024
Neural Network-Based Score Estimation in Diffusion Models: Optimization and Generalization
Yinbin Han
Meisam Razaviyayn
Renyuan Xu
DiffM
138
16
0
28 Jan 2024
A Survey on Statistical Theory of Deep Learning: Approximation, Training Dynamics, and Generative Models
Namjoon Suh
Guang Cheng
MedIm
109
14
0
14 Jan 2024
Weak Correlations as the Underlying Principle for Linearization of Gradient-Based Learning Systems
Ori Shem-Ur
Yaron Oz
63
0
0
08 Jan 2024