Gradient Descent Provably Optimizes Over-parameterized Neural Networks
arXiv 1810.02054 (v2, latest) · 4 October 2018
S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh (MLT, ODL)

Papers citing "Gradient Descent Provably Optimizes Over-parameterized Neural Networks"

Showing 50 of 882 citing papers.
Neural Networks can Learn Representations with Gradient Descent
Alexandru Damian, Jason D. Lee, Mahdi Soltanolkotabi (SSL, MLT) · 30 Jun 2022

Theoretical Perspectives on Deep Learning Methods in Inverse Problems
Jonathan Scarlett, Reinhard Heckel, M. Rodrigues, Paul Hand, Yonina C. Eldar (AI4CE) · 29 Jun 2022

Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis
Alexander Munteanu, Simon Omlor, Zhao Song, David P. Woodruff · 26 Jun 2022

Learning sparse features can lead to overfitting in neural networks
Leonardo Petrini, Francesco Cagnetta, Eric Vanden-Eijnden, Matthieu Wyart (MLT) · 24 Jun 2022

Optical Flow Regularization of Implicit Neural Representations for Video Frame Interpolation
Weihao Zhuang, T. Hascoet, R. Takashima, T. Takiguchi · 22 Jun 2022

Limitations of the NTK for Understanding Generalization in Deep Learning
Nikhil Vyas, Yamini Bansal, Preetum Nakkiran · 20 Jun 2022

Adversarial Robustness is at Odds with Lazy Training
Yunjuan Wang, Enayat Ullah, Poorya Mianjy, R. Arora (SILM, AAML) · 18 Jun 2022

Large-width asymptotics for ReLU neural networks with α-Stable initializations
Stefano Favaro, S. Fortini, Stefano Peluchetti · 16 Jun 2022

Why Quantization Improves Generalization: NTK of Binary Weight Neural Networks
Kaiqi Zhang, Ming Yin, Yu Wang (MQ) · 13 Jun 2022

On the Convergence to a Global Solution of Shuffling-Type Gradient Algorithms
Lam M. Nguyen, Trang H. Tran · 13 Jun 2022

Analysis of Branch Specialization and its Application in Image Decomposition
Jonathan Brokman, Guy Gilboa · 12 Jun 2022

Gradient Boosting Performs Gaussian Process Inference
Aleksei Ustimenko, Artem Beliakov, Liudmila Prokhorenkova (BDL) · 11 Jun 2022

Parameter Convex Neural Networks
Jingcheng Zhou, Wei Wei, Xing Li, Bowen Pang, Zhiming Zheng · 11 Jun 2022

Neural Collapse: A Review on Modelling Principles and Generalization
Vignesh Kothapalli · 08 Jun 2022

Identifying good directions to escape the NTK regime and efficiently learn low-degree plus sparse polynomials
Eshaan Nichani, Yunzhi Bai, Jason D. Lee · 08 Jun 2022

Spectral Bias Outside the Training Set for Deep Networks in the Kernel Regime
Benjamin Bowman, Guido Montúfar · 06 Jun 2022

Non-convex online learning via algorithmic equivalence
Udaya Ghai, Zhou Lu, Elad Hazan · 30 May 2022

Long-Tailed Learning Requires Feature Learning
T. Laurent, J. V. Brecht, Xavier Bresson (VLM) · 29 May 2022

Global Convergence of Over-parameterized Deep Equilibrium Models
Zenan Ling, Xingyu Xie, Qiuhao Wang, Zongpeng Zhang, Zhouchen Lin · 27 May 2022

A Framework for Overparameterized Learning
Dávid Terjék, Diego González-Sánchez (MLT) · 26 May 2022

Empirical Phase Diagram for Three-layer Neural Networks with Infinite Width
Hanxu Zhou, Qixuan Zhou, Zhenyuan Jin, Yaoyu Zhang, Zhi-Qin John Xu · 24 May 2022

Quadratic models for understanding catapult dynamics of neural networks
Libin Zhu, Chaoyue Liu, Adityanarayanan Radhakrishnan, M. Belkin · 24 May 2022

Transition to Linearity of General Neural Networks with Directed Acyclic Graph Architecture
Libin Zhu, Chaoyue Liu, M. Belkin (GNN, AI4CE) · 24 May 2022

Memorization and Optimization in Deep Neural Networks with Minimum Over-parameterization
Simone Bombari, Mohammad Hossein Amani, Marco Mondelli · 20 May 2022

Mean-Field Analysis of Two-Layer Neural Networks: Global Optimality with Linear Convergence Rates
Jingwei Zhang, Xunpeng Huang, Jincheng Yu (MLT) · 19 May 2022

On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias
Itay Safran, Gal Vardi, Jason D. Lee (MLT) · 18 May 2022

Trading Positional Complexity vs. Deepness in Coordinate Networks
Jianqiao Zheng, Sameera Ramasinghe, Xueqian Li, Simon Lucey · 18 May 2022

Dimensionality Reduced Training by Pruning and Freezing Parts of a Deep Neural Network, a Survey
Paul Wimmer, Jens Mehnert, Alexandru Paul Condurache (DD) · 17 May 2022

Gradient Descent Optimizes Infinite-Depth ReLU Implicit Networks with Linear Widths
Tianxiang Gao, Hongyang Gao (MLT) · 16 May 2022

Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis
Wuyang Chen, Wei-Ping Huang, Xinyu Gong, Boris Hanin, Zhangyang Wang · 11 May 2022

High-dimensional Asymptotics of Feature Learning: How One Gradient Step Improves the Representation
Jimmy Ba, Murat A. Erdogdu, Taiji Suzuki, Zhichao Wang, Denny Wu, Greg Yang (MLT) · 03 May 2022

Dynamic Programming in Rank Space: Scaling Structured Inference with Low-Rank HMMs and PCFGs
Aaron Courville, Wei Liu, Kewei Tu · 01 May 2022

Beyond the Quadratic Approximation: the Multiscale Structure of Neural Network Loss Landscapes
Chao Ma, D. Kunin, Lei Wu, Lexing Ying · 24 Apr 2022

Spectrum of inner-product kernel matrices in the polynomial regime and multiple descent phenomenon in kernel ridge regression
Theodor Misiakiewicz · 21 Apr 2022

Theory of Graph Neural Networks: Representation and Learning
Stefanie Jegelka (GNN, AI4CE) · 16 Apr 2022

On Convergence Lemma and Convergence Stability for Piecewise Analytic Functions
Xiaotie Deng, Hanyu Li, Ningyuan Li · 04 Apr 2022

Convergence of gradient descent for deep neural networks
S. Chatterjee (ODL) · 30 Mar 2022

Random matrix analysis of deep neural network weight matrices
M. Thamm, Max Staats, B. Rosenow · 28 Mar 2022

On the Neural Tangent Kernel Analysis of Randomly Pruned Neural Networks
Hongru Yang, Zhangyang Wang (MLT) · 27 Mar 2022

Modality Competition: What Makes Joint Training of Multi-modal Network Fail in Deep Learning? (Provably)
Yu Huang, Junyang Lin, Chang Zhou, Hongxia Yang, Longbo Huang · 23 Mar 2022

On the (Non-)Robustness of Two-Layer Neural Networks in Different Learning Regimes
Elvis Dohmatob, A. Bietti (AAML) · 22 Mar 2022

On the Generalization Mystery in Deep Learning
S. Chatterjee, Piotr Zielinski (OOD) · 18 Mar 2022

On the Convergence of Certified Robust Training with Interval Bound Propagation
Yihan Wang, Zhouxing Shi, Quanquan Gu, Cho-Jui Hsieh · 16 Mar 2022

Variational inference of fractional Brownian motion with linear computational complexity
Hippolyte Verdier, François Laurent, Alhassan Cassé, Christian L. Vestergaard, Jean-Baptiste Masson · 15 Mar 2022

Deep Regression Ensembles
Antoine Didisheim, Bryan Kelly, Semyon Malamud (UQCV) · 10 Mar 2022

Transition to Linearity of Wide Neural Networks is an Emerging Property of Assembling Weak Models
Chaoyue Liu, Libin Zhu, M. Belkin · 10 Mar 2022

Covariate-Balancing-Aware Interpretable Deep Learning models for Treatment Effect Estimation
Kan Chen, Qishuo Yin, Q. Long (CML) · 07 Mar 2022

The Spectral Bias of Polynomial Neural Networks
Moulik Choraria, L. Dadi, Grigorios G. Chrysos, Julien Mairal, Volkan Cevher · 27 Feb 2022

Sparse Neural Additive Model: Interpretable Deep Learning with Feature Selection via Group Sparsity
Shiyun Xu, Zhiqi Bu, Pratik Chaudhari, Ian Barnett · 25 Feb 2022

Benefit of Interpolation in Nearest Neighbor Algorithms
Yue Xing, Qifan Song, Guang Cheng · 23 Feb 2022