arXiv 1712.08968 · Cited By
Spurious Local Minima are Common in Two-Layer ReLU Neural Networks
Itay Safran, Ohad Shamir · 24 December 2017
Papers citing "Spurious Local Minima are Common in Two-Layer ReLU Neural Networks" (50 / 57 papers shown):
1. Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input | Ziang Chen, Rong Ge | MLT | 61 · 1 · 0 | 10 Jan 2025
2. Learning Gaussian Multi-Index Models with Gradient Flow: Time Complexity and Directional Convergence | Berfin Simsek, Amire Bendjeddou, Daniel Hsu | 44 · 0 · 0 | 13 Nov 2024
3. How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance | Hongkang Li, Shuai Zhang, Yihua Zhang, Meng Wang, Sijia Liu, Pin-Yu Chen | 41 · 4 · 0 | 12 Mar 2024
4. How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization | Nuoya Xiong, Lijun Ding, Simon S. Du | 32 · 11 · 0 | 03 Oct 2023
5. Worrisome Properties of Neural Network Controllers and Their Symbolic Representations | J. Cyranka, Kevin E. M. Church, J. Lessard | 39 · 0 · 0 | 28 Jul 2023
6. NTK-SAP: Improving neural network pruning by aligning training dynamics | Yite Wang, Dawei Li, Ruoyu Sun | 34 · 19 · 0 | 06 Apr 2023
7. Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron | Weihang Xu, S. Du | 34 · 16 · 0 | 20 Feb 2023
8. An SDE for Modeling SAM: Theory and Insights | Enea Monzio Compagnoni, Luca Biggio, Antonio Orvieto, F. Proske, Hans Kersting, Aurelien Lucchi | 23 · 13 · 0 | 19 Jan 2023
9. Regression as Classification: Influence of Task Formulation on Neural Network Features | Lawrence Stewart, Francis R. Bach, Quentin Berthet, Jean-Philippe Vert | 27 · 24 · 0 | 10 Nov 2022
10. When Expressivity Meets Trainability: Fewer than n Neurons Can Work | Jiawei Zhang, Yushun Zhang, Mingyi Hong, Ruoyu Sun, Zhi-Quan Luo | 26 · 10 · 0 | 21 Oct 2022
11. Annihilation of Spurious Minima in Two-Layer ReLU Networks | Yossi Arjevani, M. Field | 16 · 8 · 0 | 12 Oct 2022
12. Behind the Scenes of Gradient Descent: A Trajectory Analysis via Basis Function Decomposition | Jianhao Ma, Li-Zhen Guo, S. Fattahi | 38 · 4 · 0 | 01 Oct 2022
13. On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias | Itay Safran, Gal Vardi, Jason D. Lee | MLT | 56 · 23 · 0 | 18 May 2022
14. Self-scalable Tanh (Stan): Faster Convergence and Better Generalization in Physics-informed Neural Networks | Raghav Gnanasambandam, Bo Shen, Jihoon Chung, Xubo Yue, Zhenyu Kong | LRM | 32 · 12 · 0 | 26 Apr 2022
15. On Feature Learning in Neural Networks with Global Convergence Guarantees | Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna | MLT | 33 · 13 · 0 | 22 Apr 2022
16. Exact Solutions of a Deep Linear Network | Liu Ziyin, Botao Li, Xiangmin Meng | ODL | 19 · 21 · 0 | 10 Feb 2022
17. Understanding Deep Contrastive Learning via Coordinate-wise Optimization | Yuandong Tian | 52 · 34 · 0 | 29 Jan 2022
18. Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks | Bartlomiej Polaczyk, J. Cyranka | ODL | 33 · 3 · 0 | 28 Jan 2022
19. How does unlabeled data improve generalization in self-training? A one-hidden-layer theoretical analysis | Shuai Zhang, Hao Wu, Sijia Liu, Pin-Yu Chen, Jinjun Xiong | SSL, MLT | 41 · 22 · 0 | 21 Jan 2022
20. Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape | Devansh Bisla, Jing Wang, A. Choromańska | 25 · 34 · 0 | 20 Jan 2022
21. Mode connectivity in the loss landscape of parameterized quantum circuits | Kathleen E. Hamilton, E. Lynn, R. Pooser | 25 · 3 · 0 | 09 Nov 2021
22. Path Regularization: A Convexity and Sparsity Inducing Regularization for Parallel ReLU Networks | Tolga Ergen, Mert Pilanci | 29 · 16 · 0 | 18 Oct 2021
23. Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks | Shuai Zhang, Meng Wang, Sijia Liu, Pin-Yu Chen, Jinjun Xiong | UQCV, MLT | 31 · 13 · 0 | 12 Oct 2021
24. Global Optimality Beyond Two Layers: Training Deep ReLU Networks via Convex Programs | Tolga Ergen, Mert Pilanci | OffRL, MLT | 27 · 32 · 0 | 11 Oct 2021
25. Exponentially Many Local Minima in Quantum Neural Networks | Xuchen You, Xiaodi Wu | 72 · 51 · 0 | 06 Oct 2021
26. Analytic Study of Families of Spurious Minima in Two-Layer ReLU Neural Networks: A Tale of Symmetry II | Yossi Arjevani, M. Field | 28 · 18 · 0 | 21 Jul 2021
27. Equivariant bifurcation, quadratic equivariants, and symmetry breaking for the standard representation of S_n | Yossi Arjevani, M. Field | 27 · 8 · 0 | 06 Jul 2021
28. Landscape analysis for shallow neural networks: complete classification of critical points for affine target functions | Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek | 24 · 10 · 0 | 19 Mar 2021
29. Understanding self-supervised Learning Dynamics without Contrastive Pairs | Yuandong Tian, Xinlei Chen, Surya Ganguli | SSL | 138 · 279 · 0 | 12 Feb 2021
30. The Nonconvex Geometry of Linear Inverse Problems | Armin Eftekhari, Peyman Mohajerin Esfahani | 15 · 1 · 0 | 07 Jan 2021
31. Learning Graph Neural Networks with Approximate Gradient Descent | Qunwei Li, Shaofeng Zou, Leon Wenliang Zhong | GNN | 32 · 1 · 0 | 07 Dec 2020
32. PAC Confidence Predictions for Deep Neural Network Classifiers | Sangdon Park, Shuo Li, Insup Lee, Osbert Bastani | UQCV | 24 · 25 · 0 | 02 Nov 2020
33. Non-convergence of stochastic gradient descent in the training of deep neural networks | Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek | 14 · 37 · 0 | 12 Jun 2020
34. Symmetry & critical points for a model shallow neural network | Yossi Arjevani, M. Field | 28 · 13 · 0 | 23 Mar 2020
35. Growing axons: greedy learning of neural networks with application to function approximation | Daria Fokina, Ivan V. Oseledets | 16 · 18 · 0 | 28 Oct 2019
36. The Local Elasticity of Neural Networks | Hangfeng He, Weijie J. Su | 34 · 44 · 0 | 15 Oct 2019
37. Neural ODEs as the Deep Limit of ResNets with constant weights | B. Avelin, K. Nystrom | ODL | 34 · 31 · 0 | 28 Jun 2019
38. Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks | Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruosong Wang | MLT | 35 · 962 · 0 | 24 Jan 2019
39. Width Provably Matters in Optimization for Deep Linear Neural Networks | S. Du, Wei Hu | 16 · 93 · 0 | 24 Jan 2019
40. Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks | Difan Zou, Yuan Cao, Dongruo Zhou, Quanquan Gu | ODL | 22 · 446 · 0 | 21 Nov 2018
41. Gradient Descent Finds Global Minima of Deep Neural Networks | S. Du, J. Lee, Haochuan Li, Liwei Wang, M. Tomizuka | ODL | 26 · 1,122 · 0 | 09 Nov 2018
42. A Closer Look at Deep Policy Gradients | Andrew Ilyas, Logan Engstrom, Shibani Santurkar, Dimitris Tsipras, Firdaus Janoos, Larry Rudolph, Aleksander Madry | 22 · 50 · 0 | 06 Nov 2018
43. On the Convergence Rate of Training Recurrent Neural Networks | Zeyuan Allen-Zhu, Yuanzhi Li, Zhao-quan Song | 18 · 191 · 0 | 29 Oct 2018
44. Benefits of over-parameterization with EM | Ji Xu, Daniel J. Hsu, A. Maleki | 30 · 29 · 0 | 26 Oct 2018
45. Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity | Chulhee Yun, S. Sra, Ali Jadbabaie | 18 · 117 · 0 | 17 Oct 2018
46. Learning Two-layer Neural Networks with Symmetric Inputs | Rong Ge, Rohith Kuditipudi, Zhize Li, Xiang Wang | OOD, MLT | 28 · 57 · 0 | 16 Oct 2018
47. A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks | Sanjeev Arora, Nadav Cohen, Noah Golowich, Wei Hu | 16 · 280 · 0 | 04 Oct 2018
48. Gradient Descent Provably Optimizes Over-parameterized Neural Networks | S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh | MLT, ODL | 35 · 1,251 · 0 | 04 Oct 2018
49. Learning ReLU Networks on Linearly Separable Data: Algorithm, Optimality, and Generalization | G. Wang, G. Giannakis, Jie Chen | MLT | 24 · 131 · 0 | 14 Aug 2018
50. Model Reconstruction from Model Explanations | S. Milli, Ludwig Schmidt, Anca Dragan, Moritz Hardt | FAtt | 19 · 177 · 0 | 13 Jul 2018