Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2007.04596
Cited By
Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK
9 July 2020
Yuanzhi Li
Tengyu Ma
Hongyang R. Zhang
MLT
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK"
50 / 56 papers shown
Title
Shape Matters: Understanding the Implicit Bias of the Noise Covariance
Jeff Z. HaoChen
Colin Wei
Jason D. Lee
Tengyu Ma
101
94
0
15 Jun 2020
Feature Purification: How Adversarial Training Performs Robust Deep Learning
Zeyuan Allen-Zhu
Yuanzhi Li
MLT
AAML
61
149
0
20 May 2020
Making Method of Moments Great Again? -- How can GANs learn distributions
Yuanzhi Li
Zehao Dou
GAN
22
5
0
09 Mar 2020
Optimization for deep learning: theory and algorithms
Ruoyu Sun
ODL
70
168
0
19 Dec 2019
Enhanced Convolutional Neural Tangent Kernels
Zhiyuan Li
Ruosong Wang
Dingli Yu
S. Du
Wei Hu
Ruslan Salakhutdinov
Sanjeev Arora
60
131
0
03 Nov 2019
Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks
Yu Bai
Jason D. Lee
45
116
0
03 Oct 2019
Finite Depth and Width Corrections to the Neural Tangent Kernel
Boris Hanin
Mihai Nica
MDE
56
150
0
13 Sep 2019
Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks
Yuanzhi Li
Colin Wei
Tengyu Ma
39
295
0
10 Jul 2019
Generalization Bounds of Stochastic Gradient Descent for Wide and Deep Neural Networks
Yuan Cao
Quanquan Gu
MLT
AI4CE
70
384
0
30 May 2019
What Can ResNet Learn Efficiently, Going Beyond Kernels?
Zeyuan Allen-Zhu
Yuanzhi Li
198
183
0
24 May 2019
Linearized two-layers neural networks in high dimension
Behrooz Ghorbani
Song Mei
Theodor Misiakiewicz
Andrea Montanari
MLT
45
243
0
27 Apr 2019
On Exact Computation with an Infinitely Wide Neural Net
Sanjeev Arora
S. Du
Wei Hu
Zhiyuan Li
Ruslan Salakhutdinov
Ruosong Wang
163
915
0
26 Apr 2019
On the Power and Limitations of Random Features for Understanding Neural Networks
Gilad Yehudai
Ohad Shamir
MLT
53
182
0
01 Apr 2019
Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation
Greg Yang
96
286
0
13 Feb 2019
Towards moderate overparameterization: global convergence guarantees for training shallow neural networks
Samet Oymak
Mahdi Soltanolkotabi
43
320
0
12 Feb 2019
Can SGD Learn Recurrent Neural Networks with Provable Generalization?
Zeyuan Allen-Zhu
Yuanzhi Li
MLT
LRM
88
57
0
04 Feb 2019
Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks
Sanjeev Arora
S. Du
Wei Hu
Zhiyuan Li
Ruosong Wang
MLT
144
966
0
24 Jan 2019
On Lazy Training in Differentiable Programming
Lénaïc Chizat
Edouard Oyallon
Francis R. Bach
92
823
0
19 Dec 2018
Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks
Difan Zou
Yuan Cao
Dongruo Zhou
Quanquan Gu
ODL
135
448
0
21 Nov 2018
Learning and Generalization in Overparameterized Neural Networks, Going Beyond Two Layers
Zeyuan Allen-Zhu
Yuanzhi Li
Yingyu Liang
MLT
138
769
0
12 Nov 2018
A Convergence Theory for Deep Learning via Over-Parameterization
Zeyuan Allen-Zhu
Yuanzhi Li
Zhao Song
AI4CE
ODL
192
1,457
0
09 Nov 2018
Gradient Descent Finds Global Minima of Deep Neural Networks
S. Du
Jason D. Lee
Haochuan Li
Liwei Wang
Masayoshi Tomizuka
ODL
145
1,133
0
09 Nov 2018
Learning Two Layer Rectified Neural Networks in Polynomial Time
Ainesh Bakshi
Rajesh Jayaram
David P. Woodruff
NoLa
109
69
0
05 Nov 2018
On the Convergence Rate of Training Recurrent Neural Networks
Zeyuan Allen-Zhu
Yuanzhi Li
Zhao Song
121
191
0
29 Oct 2018
Learning Two-layer Neural Networks with Symmetric Inputs
Rong Ge
Rohith Kuditipudi
Zhize Li
Xiang Wang
OOD
MLT
121
57
0
16 Oct 2018
Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel
Colin Wei
Jason D. Lee
Qiang Liu
Tengyu Ma
136
245
0
12 Oct 2018
Gradient Descent Provably Optimizes Over-parameterized Neural Networks
S. Du
Xiyu Zhai
Barnabás Póczós
Aarti Singh
MLT
ODL
157
1,261
0
04 Oct 2018
Learning Overparameterized Neural Networks via Stochastic Gradient Descent on Structured Data
Yuanzhi Li
Yingyu Liang
MLT
144
652
0
03 Aug 2018
Learning One-hidden-layer ReLU Networks via Gradient Descent
Xiao Zhang
Yaodong Yu
Lingxiao Wang
Quanquan Gu
MLT
75
134
0
20 Jun 2018
Neural Tangent Kernel: Convergence and Generalization in Neural Networks
Arthur Jacot
Franck Gabriel
Clément Hongler
198
3,160
0
20 Jun 2018
On the Global Convergence of Gradient Descent for Over-parameterized Models using Optimal Transport
Lénaïc Chizat
Francis R. Bach
OT
157
731
0
24 May 2018
Gradient Descent for One-Hidden-Layer Neural Networks: Polynomial Convergence and SQ Lower Bounds
Santosh Vempala
John Wilmes
MLT
69
50
0
07 May 2018
A Mean Field View of the Landscape of Two-Layers Neural Networks
Song Mei
Andrea Montanari
Phan-Minh Nguyen
MLT
76
855
0
18 Apr 2018
An Alternative View: When Does SGD Escape Local Minima?
Robert D. Kleinberg
Yuanzhi Li
Yang Yuan
MLT
57
316
0
17 Feb 2018
Deep contextualized word representations
Matthew E. Peters
Mark Neumann
Mohit Iyyer
Matt Gardner
Christopher Clark
Kenton Lee
Luke Zettlemoyer
NAI
118
11,520
0
15 Feb 2018
Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima
S. Du
Jason D. Lee
Yuandong Tian
Barnabás Póczós
Aarti Singh
MLT
128
236
0
03 Dec 2017
Learning One-hidden-layer Neural Networks with Landscape Design
Rong Ge
Jason D. Lee
Tengyu Ma
MLT
135
260
0
01 Nov 2017
Theoretical properties of the global optimizer of two layer neural network
Digvijay Boob
Guanghui Lan
MLT
93
34
0
30 Oct 2017
First-order Methods Almost Always Avoid Saddle Points
Jason D. Lee
Ioannis Panageas
Georgios Piliouras
Max Simchowitz
Michael I. Jordan
Benjamin Recht
ODL
116
83
0
20 Oct 2017
Regularizing and Optimizing LSTM Language Models
Stephen Merity
N. Keskar
R. Socher
148
1,093
0
07 Aug 2017
Theoretical insights into the optimization landscape of over-parameterized shallow neural networks
Mahdi Soltanolkotabi
Adel Javanmard
Jason D. Lee
118
417
0
16 Jul 2017
On the Optimization Landscape of Tensor Decompositions
Rong Ge
Tengyu Ma
86
91
0
18 Jun 2017
Provable Alternating Gradient Descent for Non-negative Matrix Factorization with Strong Correlations
Yuanzhi Li
Yingyu Liang
88
17
0
13 Jun 2017
Recovery Guarantees for One-hidden-layer Neural Networks
Kai Zhong
Zhao Song
Prateek Jain
Peter L. Bartlett
Inderjit S. Dhillon
MLT
120
336
0
10 Jun 2017
Convergence Analysis of Two-layer Neural Networks with ReLU Activation
Yuanzhi Li
Yang Yuan
MLT
106
650
0
28 May 2017
Convolutional Sequence to Sequence Learning
Jonas Gehring
Michael Auli
David Grangier
Denis Yarats
Yann N. Dauphin
AIMat
122
3,279
0
08 May 2017
An Analytical Formula of Population Gradient for two-layered ReLU network and its Applications in Convergence and Critical Point Analysis
Yuandong Tian
MLT
136
216
0
02 Mar 2017
Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs
Alon Brutzkus
Amir Globerson
MLT
115
313
0
26 Feb 2017
Recovery Guarantee of Non-negative Matrix Factorization via Alternating Updates
Yuanzhi Li
Yingyu Liang
Andrej Risteski
108
30
0
11 Nov 2016
Diverse Neural Network Learns True Target Functions
Bo Xie
Yingyu Liang
Le Song
127
137
0
09 Nov 2016
1
2
Next