1810.02054
Gradient Descent Provably Optimizes Over-parameterized Neural Networks
4 October 2018
S. Du
Xiyu Zhai
Barnabás Póczós
Aarti Singh
MLT
ODL
Papers citing
"Gradient Descent Provably Optimizes Over-parameterized Neural Networks"
50 / 882 papers shown
Competition analysis on the over-the-counter credit default swap market
L. Abraham
51
1
0
03 Dec 2020
Neural Contextual Bandits with Deep Representation and Shallow Exploration
Pan Xu
Zheng Wen
Handong Zhao
Quanquan Gu
OffRL
89
78
0
03 Dec 2020
On Generalization of Adaptive Methods for Over-parameterized Linear Regression
Vatsal Shah
Soumya Basu
Anastasios Kyrillidis
Sujay Sanghavi
AI4CE
59
4
0
28 Nov 2020
Neural collapse with unconstrained features
D. Mixon
Hans Parshall
Jianzong Pi
82
121
0
23 Nov 2020
Metric Transforms and Low Rank Matrices via Representation Theory of the Real Hyperrectangle
Josh Alman
T. Chu
Gary Miller
Shyam Narayanan
Mark Sellke
Zhao Song
38
1
0
23 Nov 2020
Normalization effects on shallow neural networks and related asymptotic expansions
Jiahui Yu
K. Spiliopoulos
53
6
0
20 Nov 2020
Gradient Starvation: A Learning Proclivity in Neural Networks
Mohammad Pezeshki
Sekouba Kaba
Yoshua Bengio
Aaron Courville
Doina Precup
Guillaume Lajoie
MLT
160
269
0
18 Nov 2020
Towards NNGP-guided Neural Architecture Search
Daniel S. Park
Jaehoon Lee
Daiyi Peng
Yuan Cao
Jascha Narain Sohl-Dickstein
BDL
71
34
0
11 Nov 2020
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee
Tengyu Xu
Yingbin Liang
Guanghui Lan
102
128
0
11 Nov 2020
On Function Approximation in Reinforcement Learning: Optimism in the Face of Large State Spaces
Zhuoran Yang
Chi Jin
Zhaoran Wang
Mengdi Wang
Michael I. Jordan
97
18
0
09 Nov 2020
Algorithms and Hardness for Linear Algebra on Geometric Graphs
Josh Alman
T. Chu
Aaron Schild
Zhao Song
120
30
0
04 Nov 2020
Which Minimizer Does My Neural Network Converge To?
Manuel Nonnenmacher
David Reeb
Ingo Steinwart
ODL
32
4
0
04 Nov 2020
Federated Knowledge Distillation
Hyowoon Seo
Jihong Park
Seungeun Oh
M. Bennis
Seong-Lyun Kim
FedML
101
92
0
04 Nov 2020
DebiNet: Debiasing Linear Models with Nonlinear Overparameterized Neural Networks
Shiyun Xu
Zhiqi Bu
15
1
0
01 Nov 2020
Over-parametrized neural networks as under-determined linear systems
Austin R. Benson
Anil Damle
Alex Townsend
16
0
0
29 Oct 2020
Are wider nets better given the same number of parameters?
A. Golubeva
Behnam Neyshabur
Guy Gur-Ari
112
44
0
27 Oct 2020
Neural Network Approximation: Three Hidden Layers Are Enough
Zuowei Shen
Haizhao Yang
Shijun Zhang
139
121
0
25 Oct 2020
A Dynamical View on Optimization Algorithms of Overparameterized Neural Networks
Zhiqi Bu
Shiyun Xu
Kan Chen
70
18
0
25 Oct 2020
On Convergence and Generalization of Dropout Training
Poorya Mianjy
R. Arora
132
30
0
23 Oct 2020
An Investigation of how Label Smoothing Affects Generalization
Blair Chen
Liu Ziyin
Zihao Wang
Paul Pu Liang
UQCV
92
18
0
23 Oct 2020
Global optimality of softmax policy gradient with single hidden layer neural networks in the mean-field regime
Andrea Agazzi
Jianfeng Lu
89
16
0
22 Oct 2020
Deep Learning is Singular, and That's Good
Daniel Murfet
Susan Wei
Biwei Huang
Hui Li
Jesse Gell-Redman
T. Quella
UQCV
79
29
0
22 Oct 2020
MixCon: Adjusting the Separability of Data Representations for Harder Data Recovery
Xiaoxiao Li
Yangsibo Huang
Binghui Peng
Zhao Song
Keqin Li
MIACV
74
1
0
22 Oct 2020
Beyond Lazy Training for Over-parameterized Tensor Decomposition
Xiang Wang
Chenwei Wu
Jason D. Lee
Tengyu Ma
Rong Ge
91
14
0
22 Oct 2020
PHEW: Constructing Sparse Networks that Learn Fast and Generalize Well without Training Data
S. M. Patil
C. Dovrolis
75
18
0
22 Oct 2020
Towards Understanding the Dynamics of the First-Order Adversaries
Zhun Deng
Hangfeng He
Jiaoyang Huang
Weijie J. Su
AAML
54
11
0
20 Oct 2020
Deep Reinforcement Learning for Adaptive Network Slicing in 5G for Intelligent Vehicular Systems and Smart Cities
A. Nassar
Y. Yilmaz
AI4CE
42
60
0
19 Oct 2020
Adaptive Dense-to-Sparse Paradigm for Pruning Online Recommendation System with Non-Stationary Data
Mao Ye
Dhruv Choudhary
Jiecao Yu
Ellie Wen
Zeliang Chen
Jiyan Yang
Jongsoo Park
Qiang Liu
A. Kejariwal
76
9
0
16 Oct 2020
Temperature check: theory and practice for training models with softmax-cross-entropy losses
Atish Agarwala
Jeffrey Pennington
Yann N. Dauphin
S. Schoenholz
UQCV
67
34
0
14 Oct 2020
How does Weight Correlation Affect the Generalisation Ability of Deep Neural Networks
Gao Jin
Xinping Yi
Liang Zhang
Lijun Zhang
S. Schewe
Xiaowei Huang
83
42
0
12 Oct 2020
A Theoretical Analysis of Catastrophic Forgetting through the NTK Overlap Matrix
T. Doan
Mehdi Abbana Bennani
Bogdan Mazoure
Guillaume Rabusseau
Pierre Alquier
CLL
88
86
0
07 Oct 2020
Constraining Logits by Bounded Function for Adversarial Robustness
Sekitoshi Kanai
Masanori Yamada
Shin'ya Yamaguchi
Hiroshi Takahashi
Yasutoshi Ida
AAML
33
4
0
06 Oct 2020
A Unifying View on Implicit Bias in Training Linear Neural Networks
Chulhee Yun
Shankar Krishnan
H. Mobahi
MLT
125
82
0
06 Oct 2020
Understanding How Over-Parametrization Leads to Acceleration: A case of learning a single teacher neuron
Jun-Kun Wang
Jacob D. Abernethy
40
1
0
04 Oct 2020
A Modular Analysis of Provable Acceleration via Polyak's Momentum: Training a Wide ReLU Network and a Deep Linear Network
Jun-Kun Wang
Chi-Heng Lin
Jacob D. Abernethy
76
24
0
04 Oct 2020
Computational Separation Between Convolutional and Fully-Connected Networks
Eran Malach
Shai Shalev-Shwartz
90
26
0
03 Oct 2020
WeMix: How to Better Utilize Data Augmentation
Yi Tian Xu
Asaf Noy
Ming Lin
Qi Qian
Hao Li
Rong Jin
84
16
0
03 Oct 2020
Interpreting Robust Optimization via Adversarial Influence Functions
Zhun Deng
Cynthia Dwork
Jialiang Wang
Linjun Zhang
TDI
49
12
0
03 Oct 2020
On the linearity of large non-linear models: when and why the tangent kernel is constant
Chaoyue Liu
Libin Zhu
M. Belkin
169
143
0
02 Oct 2020
Optimization Landscapes of Wide Deep Neural Networks Are Benign
Johannes Lederer
99
8
0
02 Oct 2020
Neural Thompson Sampling
Weitong Zhang
Dongruo Zhou
Lihong Li
Quanquan Gu
87
122
0
02 Oct 2020
Why Adversarial Interaction Creates Non-Homogeneous Patterns: A Pseudo-Reaction-Diffusion Model for Turing Instability
Litu Rout
AAML
34
1
0
01 Oct 2020
Deep Equals Shallow for ReLU Networks in Kernel Regimes
A. Bietti
Francis R. Bach
113
90
0
30 Sep 2020
How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks
Keyulu Xu
Mozhi Zhang
Jingling Li
S. Du
Ken-ichi Kawarabayashi
Stefanie Jegelka
MLT
184
313
0
24 Sep 2020
Towards a Mathematical Understanding of Neural Network-Based Machine Learning: what we know and what we don't
Weinan E
Chao Ma
Stephan Wojtowytsch
Lei Wu
AI4CE
125
134
0
22 Sep 2020
Sanity-Checking Pruning Methods: Random Tickets can Win the Jackpot
Jingtong Su
Yihang Chen
Tianle Cai
Tianhao Wu
Ruiqi Gao
Liwei Wang
Jason D. Lee
73
86
0
22 Sep 2020
Tensor Programs III: Neural Matrix Laws
Greg Yang
79
48
0
22 Sep 2020
Deep Neural Tangent Kernel and Laplace Kernel Have the Same RKHS
Lin Chen
Sheng Xu
196
94
0
22 Sep 2020
Generalized Leverage Score Sampling for Neural Networks
Jason D. Lee
Ruoqi Shen
Zhao Song
Mengdi Wang
Zheng Yu
71
43
0
21 Sep 2020
GraphNorm: A Principled Approach to Accelerating Graph Neural Network Training
Tianle Cai
Shengjie Luo
Keyulu Xu
Di He
Tie-Yan Liu
Liwei Wang
GNN
108
167
0
07 Sep 2020