Gradient Descent Provably Optimizes Over-parameterized Neural Networks

4 October 2018
S. Du
Xiyu Zhai
Barnabás Póczos
Aarti Singh
MLT
ODL

Papers citing "Gradient Descent Provably Optimizes Over-parameterized Neural Networks"

50 / 882 papers shown
Competition analysis on the over-the-counter credit default swap market
L. Abraham
51
1
0
03 Dec 2020
Neural Contextual Bandits with Deep Representation and Shallow Exploration
Pan Xu
Zheng Wen
Handong Zhao
Quanquan Gu
OffRL
89
78
0
03 Dec 2020
On Generalization of Adaptive Methods for Over-parameterized Linear Regression
Vatsal Shah
Soumya Basu
Anastasios Kyrillidis
Sujay Sanghavi
AI4CE
59
4
0
28 Nov 2020
Neural collapse with unconstrained features
D. Mixon
Hans Parshall
Jianzong Pi
82
121
0
23 Nov 2020
Metric Transforms and Low Rank Matrices via Representation Theory of the Real Hyperrectangle
Josh Alman
T. Chu
Gary Miller
Shyam Narayanan
Mark Sellke
Zhao Song
38
1
0
23 Nov 2020
Normalization effects on shallow neural networks and related asymptotic expansions
Jiahui Yu
K. Spiliopoulos
53
6
0
20 Nov 2020
Gradient Starvation: A Learning Proclivity in Neural Networks
Mohammad Pezeshki
Sekouba Kaba
Yoshua Bengio
Aaron Courville
Doina Precup
Guillaume Lajoie
MLT
160
269
0
18 Nov 2020
Towards NNGP-guided Neural Architecture Search
Daniel S. Park
Jaehoon Lee
Daiyi Peng
Yuan Cao
Jascha Narain Sohl-Dickstein
BDL
71
34
0
11 Nov 2020
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
Tengyu Xu
Yingbin Liang
Guanghui Lan
102
128
0
11 Nov 2020
On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces
Zhuoran Yang
Chi Jin
Zhaoran Wang
Mengdi Wang
Michael I. Jordan
97
18
0
09 Nov 2020
Algorithms and Hardness for Linear Algebra on Geometric Graphs
Josh Alman
T. Chu
Aaron Schild
Zhao Song
120
30
0
04 Nov 2020
Which Minimizer Does My Neural Network Converge To?
Manuel Nonnenmacher
David Reeb
Ingo Steinwart
ODL
32
4
0
04 Nov 2020
Federated Knowledge Distillation
Hyowoon Seo
Jihong Park
Seungeun Oh
M. Bennis
Seong-Lyun Kim
FedML
101
92
0
04 Nov 2020
DebiNet: Debiasing Linear Models with Nonlinear Overparameterized Neural Networks
Shiyun Xu
Zhiqi Bu
15
1
0
01 Nov 2020
Over-parametrized neural networks as under-determined linear systems
Austin R. Benson
Anil Damle
Alex Townsend
16
0
0
29 Oct 2020
Are wider nets better given the same number of parameters?
A. Golubeva
Behnam Neyshabur
Guy Gur-Ari
112
44
0
27 Oct 2020
Neural Network Approximation: Three Hidden Layers Are Enough
Zuowei Shen
Haizhao Yang
Shijun Zhang
139
121
0
25 Oct 2020
A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks
Zhiqi Bu
Shiyun Xu
Kan Chen
70
18
0
25 Oct 2020
On Convergence and Generalization of Dropout Training
Poorya Mianjy
R. Arora
132
30
0
23 Oct 2020
An Investigation of how Label Smoothing Affects Generalization
Blair Chen
Liu Ziyin
Zihao Wang
Paul Pu Liang
UQCV
92
18
0
23 Oct 2020
Global optimality of softmax policy gradient with single hidden layer neural networks in the mean-field regime
Andrea Agazzi
Jianfeng Lu
89
16
0
22 Oct 2020
Deep Learning is Singular, and That's Good
Daniel Murfet
Susan Wei
Biwei Huang
Hui Li
Jesse Gell-Redman
T. Quella
UQCV
79
29
0
22 Oct 2020
MixCon: Adjusting the Separability of Data Representations for Harder Data Recovery
Xiaoxiao Li
Yangsibo Huang
Binghui Peng
Zhao Song
Keqin Li
MIACV
74
1
0
22 Oct 2020
Beyond Lazy Training for Over-parameterized Tensor Decomposition
Xiang Wang
Chenwei Wu
Jason D. Lee
Tengyu Ma
Rong Ge
91
14
0
22 Oct 2020
PHEW: Constructing Sparse Networks that Learn Fast and Generalize Well without Training Data
S. M. Patil
C. Dovrolis
75
18
0
22 Oct 2020
Towards Understanding the Dynamics of the First-Order Adversaries
Zhun Deng
Hangfeng He
Jiaoyang Huang
Weijie J. Su
AAML
54
11
0
20 Oct 2020
Deep Reinforcement Learning for Adaptive Network Slicing in 5G for Intelligent Vehicular Systems and Smart Cities
A. Nassar
Y. Yilmaz
AI4CE
42
60
0
19 Oct 2020
Adaptive Dense-to-Sparse Paradigm for Pruning Online Recommendation System with Non-Stationary Data
Mao Ye
Dhruv Choudhary
Jiecao Yu
Ellie Wen
Zeliang Chen
Jiyan Yang
Jongsoo Park
Qiang Liu
A. Kejariwal
76
9
0
16 Oct 2020
Temperature check: theory and practice for training models with softmax-cross-entropy losses
Atish Agarwala
Jeffrey Pennington
Yann N. Dauphin
S. Schoenholz
UQCV
67
34
0
14 Oct 2020
How does Weight Correlation Affect the Generalisation Ability of Deep Neural Networks
Gao Jin
Xinping Yi
Liang Zhang
Lijun Zhang
S. Schewe
Xiaowei Huang
83
42
0
12 Oct 2020
A Theoretical Analysis of Catastrophic Forgetting through the NTK Overlap Matrix
T. Doan
Mehdi Abbana Bennani
Bogdan Mazoure
Guillaume Rabusseau
Pierre Alquier
CLL
88
86
0
07 Oct 2020
Constraining Logits by Bounded Function for Adversarial Robustness
Sekitoshi Kanai
Masanori Yamada
Shin'ya Yamaguchi
Hiroshi Takahashi
Yasutoshi Ida
AAML
33
4
0
06 Oct 2020
A Unifying View on Implicit Bias in Training Linear Neural Networks
Chulhee Yun
Shankar Krishnan
H. Mobahi
MLT
125
82
0
06 Oct 2020
Understanding How Over-Parametrization Leads to Acceleration: A case of learning a single teacher neuron
Jun-Kun Wang
Jacob D. Abernethy
40
1
0
04 Oct 2020
A Modular Analysis of Provable Acceleration via Polyak's Momentum: Training a Wide ReLU Network and a Deep Linear Network
Jun-Kun Wang
Chi-Heng Lin
Jacob D. Abernethy
76
24
0
04 Oct 2020
Computational Separation Between Convolutional and Fully-Connected Networks
Eran Malach
Shai Shalev-Shwartz
90
26
0
03 Oct 2020
WeMix: How to Better Utilize Data Augmentation
Yi Tian Xu
Asaf Noy
Ming Lin
Qi Qian
Hao Li
Rong Jin
84
16
0
03 Oct 2020
Interpreting Robust Optimization via Adversarial Influence Functions
Zhun Deng
Cynthia Dwork
Jialiang Wang
Linjun Zhang
TDI
49
12
0
03 Oct 2020
On the linearity of large non-linear models: when and why the tangent kernel is constant
Chaoyue Liu
Libin Zhu
M. Belkin
169
143
0
02 Oct 2020
Optimization Landscapes of Wide Deep Neural Networks Are Benign
Johannes Lederer
99
8
0
02 Oct 2020
Neural Thompson Sampling
Weitong Zhang
Dongruo Zhou
Lihong Li
Quanquan Gu
87
122
0
02 Oct 2020
Why Adversarial Interaction Creates Non-Homogeneous Patterns: A Pseudo-Reaction-Diffusion Model for Turing Instability
Litu Rout
AAML
34
1
0
01 Oct 2020
Deep Equals Shallow for ReLU Networks in Kernel Regimes
A. Bietti
Francis R. Bach
113
90
0
30 Sep 2020
How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks
Keyulu Xu
Mozhi Zhang
Jingling Li
S. Du
Ken-ichi Kawarabayashi
Stefanie Jegelka
MLT
184
313
0
24 Sep 2020
Towards a Mathematical Understanding of Neural Network-Based Machine Learning: what we know and what we don't
Weinan E
Chao Ma
Stephan Wojtowytsch
Lei Wu
AI4CE
125
134
0
22 Sep 2020
Sanity-Checking Pruning Methods: Random Tickets can Win the Jackpot
Jingtong Su
Yihang Chen
Tianle Cai
Tianhao Wu
Ruiqi Gao
Liwei Wang
Jason D. Lee
73
86
0
22 Sep 2020
Tensor Programs III: Neural Matrix Laws
Greg Yang
79
48
0
22 Sep 2020
Deep Neural Tangent Kernel and Laplace Kernel Have the Same RKHS
Lin Chen
Sheng Xu
196
94
0
22 Sep 2020
Generalized Leverage Score Sampling for Neural Networks
Jason D. Lee
Ruoqi Shen
Zhao Song
Mengdi Wang
Zheng Yu
71
43
0
21 Sep 2020
GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training
Tianle Cai
Shengjie Luo
Keyulu Xu
Di He
Tie-Yan Liu
Liwei Wang
GNN
108
167
0
07 Sep 2020