ResearchTrend.AI

Gradient Descent Provably Optimizes Over-parameterized Neural Networks
arXiv:1810.02054, v2 (latest). 4 October 2018.
S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh

Papers citing "Gradient Descent Provably Optimizes Over-parameterized Neural Networks"

Showing 50 of 882 citing papers.
  1. Finite Sample Identification of Wide Shallow Neural Networks with Biases. M. Fornasier, T. Klock, Marco Mondelli, Michael Rauchensteiner (08 Nov 2022)
  2. A Functional-Space Mean-Field Theory of Partially-Trained Three-Layer Neural Networks. Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna (28 Oct 2022)
  3. LOFT: Finding Lottery Tickets through Filter-wise Training. Qihan Wang, Chen Dun, Fangshuo Liao, C. Jermaine, Anastasios Kyrillidis (28 Oct 2022)
  4. Sparsity in Continuous-Depth Neural Networks. H. Aliee, Till Richter, Mikhail Solonin, I. Ibarra, Fabian J. Theis, Niki Kilbertus (26 Oct 2022)
  5. Pushing the Efficiency Limit Using Structured Sparse Convolutions. Vinay Kumar Verma, Nikhil Mehta, Shijing Si, Ricardo Henao, Lawrence Carin (23 Oct 2022)
  6. Global Convergence of SGD On Two Layer Neural Nets. Pulkit Gopalani, Anirbit Mukherjee (20 Oct 2022)
  7. Theoretical Guarantees for Permutation-Equivariant Quantum Neural Networks. Louis Schatzki, Martín Larocca, Quynh T. Nguyen, F. Sauvage, M. Cerezo (18 Oct 2022)
  8. Implicit Bias in Leaky ReLU Networks Trained on High-Dimensional Data. Spencer Frei, Gal Vardi, Peter L. Bartlett, Nathan Srebro, Wei Hu (13 Oct 2022)
  9. Mean-field analysis for heavy ball methods: Dropout-stability, connectivity, and global convergence. Diyuan Wu, Vyacheslav Kungurtsev, Marco Mondelli (13 Oct 2022)
  10. From Gradient Flow on Population Loss to Learning with Stochastic Gradient Descent. Satyen Kale, Jason D. Lee, Chris De Sa, Ayush Sekhari, Karthik Sridharan (13 Oct 2022)
  11. Few-shot Backdoor Attacks via Neural Tangent Kernels. J. Hayase, Sewoong Oh (12 Oct 2022)
  12. Toward Sustainable Continual Learning: Detection and Knowledge Repurposing of Similar Tasks. Sijia Wang, Yoojin Choi, Junya Chen, Mostafa El-Khamy, Ricardo Henao (11 Oct 2022)
  13. A Kernel-Based View of Language Model Fine-Tuning. Sadhika Malladi, Alexander Wettig, Dingli Yu, Danqi Chen, Sanjeev Arora (11 Oct 2022)
  14. What Can the Neural Tangent Kernel Tell Us About Adversarial Robustness? Nikolaos Tsilivis, Julia Kempe (11 Oct 2022)
  15. Efficient NTK using Dimensionality Reduction. Nir Ailon, Supratim Shit (10 Oct 2022)
  16. On skip connections and normalisation layers in deep optimisation. L. MacDonald, Jack Valmadre, Hemanth Saratchandran, Simon Lucey (10 Oct 2022)
  17. Dynamical Isometry for Residual Networks. Advait Gadhikar, R. Burkholz (05 Oct 2022)
  18. Structural Estimation of Markov Decision Processes in High-Dimensional State Space with Finite-Time Guarantees. Siliang Zeng, Mingyi Hong, Alfredo García (04 Oct 2022)
  19. Plateau in Monotonic Linear Interpolation -- A "Biased" View of Loss Landscape for Deep Networks. Xiang Wang, Annie Wang, Mo Zhou, Rong Ge (03 Oct 2022)
  20. A Combinatorial Perspective on the Optimization of Shallow ReLU Networks. Michael Matena, Colin Raffel (01 Oct 2022)
  21. On the optimization and generalization of overparameterized implicit neural networks. Tianxiang Gao, Hongyang Gao (30 Sep 2022)
  22. Neural Networks Efficiently Learn Low-Dimensional Representations with SGD. Alireza Mousavi-Hosseini, Sejun Park, M. Girotti, Ioannis Mitliagkas, Murat A. Erdogdu (29 Sep 2022)
  23. Stability and Generalization Analysis of Gradient Methods for Shallow Neural Networks. Yunwen Lei, Rong Jin, Yiming Ying (19 Sep 2022)
  24. Lazy vs hasty: linearization in deep networks impacts learning schedule based on example difficulty. Thomas George, Guillaume Lajoie, A. Baratin (19 Sep 2022)
  25. Approximation results for Gradient Descent trained Shallow Neural Networks in $1d$. R. Gentile, G. Welper (17 Sep 2022)
  26. Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a Polynomial Net Study. Yongtao Wu, Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, Volkan Cevher (16 Sep 2022)
  27. Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization). Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, Volkan Cevher (15 Sep 2022)
  28. Generalization Properties of NAS under Activation and Skip Connection Search. Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, Volkan Cevher (15 Sep 2022)
  29. On the Trade-Off between Actionable Explanations and the Right to be Forgotten. Martin Pawelczyk, Tobias Leemann, Asia J. Biega, Gjergji Kasneci (30 Aug 2022)
  30. Neural Tangent Kernel: A Survey. Eugene Golikov, Eduard Pokonechnyy, Vladimir Korviakov (29 Aug 2022)
  31. Overparameterization from Computational Constraints. Sanjam Garg, S. Jha, Saeed Mahloujifar, Mohammad Mahmoody, Mingyuan Wang (27 Aug 2022)
  32. Universal Solutions of Feedforward ReLU Networks for Interpolations. Changcun Huang (16 Aug 2022)
  33. Gaussian Process Surrogate Models for Neural Networks. Michael Y. Li, Erin Grant, Thomas Griffiths (11 Aug 2022)
  34. A Sublinear Adversarial Training Algorithm. Yeqi Gao, Lianke Qin, Zhao Song, Yitan Wang (10 Aug 2022)
  35. Training Overparametrized Neural Networks in Sublinear Time. Yichuan Deng, Han Hu, Zhao Song, Omri Weinstein, Danyang Zhuo (09 Aug 2022)
  36. Neural Set Function Extensions: Learning with Discrete Functions in High Dimensions. Nikolaos Karalias, Joshua Robinson, Andreas Loukas, Stefanie Jegelka (08 Aug 2022)
  37. Provable Acceleration of Nesterov's Accelerated Gradient Method over Heavy Ball Method in Training Over-Parameterized Neural Networks. Xin Liu, Wei Tao, Wei Li, Dazhi Zhan, Jun Wang, Zhisong Pan (08 Aug 2022)
  38. On Fast Simulation of Dynamical System with Neural Vector Enhanced Numerical Solver. Zhongzhan Huang, Senwei Liang, Hong Zhang, Haizhao Yang, Liang Lin (07 Aug 2022)
  39. Federated Adversarial Learning: A Framework with Convergence Analysis. Xiaoxiao Li, Zhao Song, Jiaming Yang (07 Aug 2022)
  40. Towards Understanding Mixture of Experts in Deep Learning. Zixiang Chen, Yihe Deng, Yue-bo Wu, Quanquan Gu, Yuan-Fang Li (04 Aug 2022)
  41. Agnostic Learning of General ReLU Activation Using Gradient Descent. Pranjal Awasthi, Alex K. Tang, Aravindan Vijayaraghavan (04 Aug 2022)
  42. Gradient descent provably escapes saddle points in the training of shallow ReLU networks. Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek (03 Aug 2022)
  43. Analyzing Sharpness along GD Trajectory: Progressive Sharpening and Edge of Stability. Z. Li, Zixuan Wang, Jian Li (26 Jul 2022)
  44. Hidden Progress in Deep Learning: SGD Learns Parities Near the Computational Limit. Boaz Barak, Benjamin L. Edelman, Surbhi Goel, Sham Kakade, Eran Malach, Cyril Zhang (18 Jul 2022)
  45. The Lottery Ticket Hypothesis for Self-attention in Convolutional Neural Network. Zhongzhan Huang, Senwei Liang, Mingfu Liang, Wei He, Haizhao Yang, Liang Lin (16 Jul 2022)
  46. Riemannian Natural Gradient Methods. Jiang Hu, Ruicheng Ao, Anthony Man-Cho So, Minghan Yang, Zaiwen Wen (15 Jul 2022)
  47. Efficient Augmentation for Imbalanced Deep Learning. Damien Dablain, C. Bellinger, Bartosz Krawczyk, Nitesh Chawla (13 Jul 2022)
  48. Implicit Bias of Gradient Descent on Reparametrized Models: On Equivalence to Mirror Descent. Zhiyuan Li, Tianhao Wang, Jason D. Lee, Sanjeev Arora (08 Jul 2022)
  49. Learning and generalization of one-hidden-layer neural networks, going beyond standard Gaussian data. Hongkang Li, Shuai Zhang, Ming Wang (07 Jul 2022)
  50. Neural Stein critics with staged $L^2$-regularization. Matthew Repasky, Xiuyuan Cheng, Yao Xie (07 Jul 2022)