1712.00779
Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima
3 December 2017
S. Du
Jason D. Lee
Yuandong Tian
Barnabás Póczós
Aarti Singh
MLT
Papers citing "Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima"
50 / 102 papers shown
Title
Stepsize anything: A unified learning rate schedule for budgeted-iteration training
Anda Tang
Yiming Dong
Yutao Zeng
Zhou Xun
Zhouchen Lin
380
0
0
30 May 2025
Analysis of the rate of convergence of an over-parametrized convolutional neural network image classifier learned by gradient descent
Michael Kohler
A. Krzyżak
Benjamin Walter
95
1
0
13 May 2024
How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance
Hongkang Li
Shuai Zhang
Yihua Zhang
Meng Wang
Sijia Liu
Pin-Yu Chen
107
4
0
12 Mar 2024
Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer
Yuandong Tian
Yiping Wang
Beidi Chen
S. Du
MLT
115
79
0
25 May 2023
Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron
Weihang Xu
S. Du
108
16
0
20 Feb 2023
Bayesian Interpolation with Deep Linear Networks
Boris Hanin
Alexander Zlokapa
153
26
0
29 Dec 2022
Finite Sample Identification of Wide Shallow Neural Networks with Biases
M. Fornasier
T. Klock
Marco Mondelli
Michael Rauchensteiner
62
6
0
08 Nov 2022
Annihilation of Spurious Minima in Two-Layer ReLU Networks
Yossi Arjevani
M. Field
65
8
0
12 Oct 2022
Explicitising The Implicit Intrepretability of Deep Neural Networks Via Duality
Chandrashekar Lakshminarayanan
Ashutosh Kumar Singh
A. Rajkumar
AI4CE
82
1
0
01 Mar 2022
Benign Overfitting in Two-layer Convolutional Neural Networks
Yuan Cao
Zixiang Chen
M. Belkin
Quanquan Gu
MLT
98
90
0
14 Feb 2022
Understanding Deep Contrastive Learning via Coordinate-wise Optimization
Yuandong Tian
186
35
0
29 Jan 2022
Generalization Performance of Empirical Risk Minimization on Over-parameterized Deep ReLU Nets
Shao-Bo Lin
Yao Wang
Ding-Xuan Zhou
ODL
92
6
0
28 Nov 2021
Mode connectivity in the loss landscape of parameterized quantum circuits
Kathleen E. Hamilton
E. Lynn
R. Pooser
72
3
0
09 Nov 2021
GradSign: Model Performance Inference with Theoretical Insights
Zhihao Zhang
Zhihao Jia
82
24
0
16 Oct 2021
Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks
Shuai Zhang
Meng Wang
Sijia Liu
Pin-Yu Chen
Jinjun Xiong
UQCV
MLT
89
13
0
12 Oct 2021
Constants of Motion: The Antidote to Chaos in Optimization and Game Dynamics
Georgios Piliouras
Xiao Wang
69
0
0
08 Sep 2021
Analytic Study of Families of Spurious Minima in Two-Layer ReLU Neural Networks: A Tale of Symmetry II
Yossi Arjevani
M. Field
72
19
0
21 Jul 2021
Continual Learning in the Teacher-Student Setup: Impact of Task Similarity
Sebastian Lee
Sebastian Goldt
Andrew M. Saxe
CLL
88
75
0
09 Jul 2021
Neural Active Learning with Performance Guarantees
Pranjal Awasthi
Christoph Dann
Claudio Gentile
Ayush Sekhari
Zhilei Wang
56
22
0
06 Jun 2021
From Local Pseudorandom Generators to Hardness of Learning
Amit Daniely
Gal Vardi
132
32
0
20 Jan 2021
Learning Graph Neural Networks with Approximate Gradient Descent
Qunwei Li
Shaofeng Zou
Leon Wenliang Zhong
GNN
107
1
0
07 Dec 2020
Align, then memorise: the dynamics of learning with feedback alignment
Maria Refinetti
Stéphane d'Ascoli
Ruben Ohana
Sebastian Goldt
107
37
0
24 Nov 2020
Inductive Bias of Gradient Descent for Weight Normalized Smooth Homogeneous Neural Nets
Depen Morwani
H. G. Ramaswamy
56
3
0
24 Oct 2020
Computational Separation Between Convolutional and Fully-Connected Networks
Eran Malach
Shai Shalev-Shwartz
95
26
0
03 Oct 2020
Generalized Leverage Score Sampling for Neural Networks
Jason D. Lee
Ruoqi Shen
Zhao Song
Mengdi Wang
Zheng Yu
79
43
0
21 Sep 2020
Nonparametric Learning of Two-Layer ReLU Residual Units
Zhunxuan Wang
Linyun He
Chunchuan Lyu
Shay B. Cohen
MLT
OffRL
204
1
0
17 Aug 2020
Analytic Characterization of the Hessian in Shallow ReLU Models: A Tale of Symmetry
Yossi Arjevani
M. Field
55
16
0
04 Aug 2020
From Boltzmann Machines to Neural Networks and Back Again
Surbhi Goel
Adam R. Klivans
Frederic Koehler
51
5
0
25 Jul 2020
Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK
Yuanzhi Li
Tengyu Ma
Hongyang R. Zhang
MLT
95
27
0
09 Jul 2020
Towards an Understanding of Residual Networks Using Neural Tangent Hierarchy (NTH)
Yuqing Li
Yaoyu Zhang
N. Yip
55
5
0
07 Jul 2020
Optimization and Generalization of Shallow Neural Networks with Quadratic Activation Functions
Stefano Sarao Mannelli
Eric Vanden-Eijnden
Lenka Zdeborová
AI4CE
79
49
0
27 Jun 2020
The Gaussian equivalence of generative models for learning with shallow neural networks
Sebastian Goldt
Bruno Loureiro
Galen Reeves
Florent Krzakala
M. Mézard
Lenka Zdeborová
BDL
114
107
0
25 Jun 2020
Fast Learning of Graph Neural Networks with Guaranteed Generalizability: One-hidden-layer Case
Shuai Zhang
Meng Wang
Sijia Liu
Pin-Yu Chen
Jinjun Xiong
MLT
AI4CE
120
34
0
25 Jun 2020
Hardness of Learning Neural Networks with Natural Weights
Amit Daniely
Gal Vardi
77
19
0
05 Jun 2020
Understanding and Improving Information Transfer in Multi-Task Learning
Sen Wu
Hongyang R. Zhang
Christopher Ré
80
158
0
02 May 2020
Piecewise linear activations substantially shape the loss surfaces of neural networks
Fengxiang He
Bohan Wang
Dacheng Tao
ODL
93
30
0
27 Mar 2020
Symmetry & critical points for a model shallow neural network
Yossi Arjevani
M. Field
117
13
0
23 Mar 2020
An Optimization and Generalization Analysis for Max-Pooling Networks
Alon Brutzkus
Amir Globerson
MLT
AI4CE
59
4
0
22 Feb 2020
Replica Exchange for Non-Convex Optimization
Jing-rong Dong
Xin T. Tong
110
21
0
23 Jan 2020
Thresholds of descending algorithms in inference problems
Stefano Sarao Mannelli
Lenka Zdeborová
AI4CE
71
4
0
02 Jan 2020
Optimization for deep learning: theory and algorithms
Ruoyu Sun
ODL
137
169
0
19 Dec 2019
Naive Gabor Networks for Hyperspectral Image Classification
Chenying Liu
Jun Li
Lin He
Antonio J. Plaza
Shutao Li
Bo Li
77
45
0
09 Dec 2019
Over-parametrized deep neural networks do not generalize well
Michael Kohler
A. Krzyżak
58
12
0
09 Dec 2019
Tight Sample Complexity of Learning One-hidden-layer Convolutional Neural Networks
Yuan Cao
Quanquan Gu
MLT
85
19
0
12 Nov 2019
Towards Understanding the Importance of Shortcut Connections in Residual Networks
Tianyi Liu
Minshuo Chen
Mo Zhou
S. Du
Enlu Zhou
T. Zhao
60
45
0
10 Sep 2019
Towards Understanding the Importance of Noise in Training Neural Networks
Mo Zhou
Tianyi Liu
Yan Li
Dachao Lin
Enlu Zhou
T. Zhao
MLT
92
26
0
07 Sep 2019
Theoretical Issues in Deep Networks: Approximation, Optimization and Generalization
T. Poggio
Andrzej Banburski
Q. Liao
ODL
128
165
0
25 Aug 2019
Hessian based analysis of SGD for Deep Nets: Dynamics and Generalization
Xinyan Li
Qilong Gu
Yingxue Zhou
Tiancong Chen
A. Banerjee
ODL
97
52
0
24 Jul 2019
Who is Afraid of Big Bad Minima? Analysis of Gradient-Flow in a Spiked Matrix-Tensor Model
Stefano Sarao Mannelli
Giulio Biroli
C. Cammarota
Florent Krzakala
Lenka Zdeborová
62
43
0
18 Jul 2019
Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks
Yuanzhi Li
Colin Wei
Tengyu Ma
101
300
0
10 Jul 2019