1712.08968
Spurious Local Minima are Common in Two-Layer ReLU Neural Networks
24 December 2017
Itay Safran
Ohad Shamir
Papers citing
"Spurious Local Minima are Common in Two-Layer ReLU Neural Networks"
50 / 62 papers shown
Title
Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input
Ziang Chen
Rong Ge
MLT
61
1
0
10 Jan 2025
Learning Gaussian Multi-Index Models with Gradient Flow: Time Complexity and Directional Convergence
Berfin Simsek
Amire Bendjeddou
Daniel Hsu
44
0
0
13 Nov 2024
How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance
Hongkang Li
Shuai Zhang
Yihua Zhang
Meng Wang
Sijia Liu
Pin-Yu Chen
41
4
0
12 Mar 2024
How Over-Parameterization Slows Down Gradient Descent in Matrix Sensing: The Curses of Symmetry and Initialization
Nuoya Xiong
Lijun Ding
Simon S. Du
35
11
0
03 Oct 2023
Worrisome Properties of Neural Network Controllers and Their Symbolic Representations
J. Cyranka
Kevin E. M. Church
J. Lessard
39
0
0
28 Jul 2023
NTK-SAP: Improving neural network pruning by aligning training dynamics
Yite Wang
Dawei Li
Ruoyu Sun
34
19
0
06 Apr 2023
Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron
Weihang Xu
S. Du
37
16
0
20 Feb 2023
An SDE for Modeling SAM: Theory and Insights
Enea Monzio Compagnoni
Luca Biggio
Antonio Orvieto
F. Proske
Hans Kersting
Aurelien Lucchi
23
13
0
19 Jan 2023
Regression as Classification: Influence of Task Formulation on Neural Network Features
Lawrence Stewart
Francis R. Bach
Quentin Berthet
Jean-Philippe Vert
27
24
0
10 Nov 2022
When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work
Jiawei Zhang
Yushun Zhang
Mingyi Hong
Ruoyu Sun
Zhi-Quan Luo
26
10
0
21 Oct 2022
Annihilation of Spurious Minima in Two-Layer ReLU Networks
Yossi Arjevani
M. Field
16
8
0
12 Oct 2022
Behind the Scenes of Gradient Descent: A Trajectory Analysis via Basis Function Decomposition
Jianhao Ma
Li-Zhen Guo
S. Fattahi
38
4
0
01 Oct 2022
On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias
Itay Safran
Gal Vardi
Jason D. Lee
MLT
59
23
0
18 May 2022
Self-scalable Tanh (Stan): Faster Convergence and Better Generalization in Physics-informed Neural Networks
Raghav Gnanasambandam
Bo Shen
Jihoon Chung
Xubo Yue
Zhenyu Kong
LRM
34
12
0
26 Apr 2022
On Feature Learning in Neural Networks with Global Convergence Guarantees
Zhengdao Chen
Eric Vanden-Eijnden
Joan Bruna
MLT
36
13
0
22 Apr 2022
Exact Solutions of a Deep Linear Network
Liu Ziyin
Botao Li
Xiangmin Meng
ODL
19
21
0
10 Feb 2022
Understanding Deep Contrastive Learning via Coordinate-wise Optimization
Yuandong Tian
52
34
0
29 Jan 2022
Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks
Bartlomiej Polaczyk
J. Cyranka
ODL
33
3
0
28 Jan 2022
How does unlabeled data improve generalization in self-training? A one-hidden-layer theoretical analysis
Shuai Zhang
Hao Wu
Sijia Liu
Pin-Yu Chen
Jinjun Xiong
SSL
MLT
41
22
0
21 Jan 2022
Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape
Devansh Bisla
Jing Wang
A. Choromańska
25
34
0
20 Jan 2022
Mode connectivity in the loss landscape of parameterized quantum circuits
Kathleen E. Hamilton
E. Lynn
R. Pooser
25
3
0
09 Nov 2021
Path Regularization: A Convexity and Sparsity Inducing Regularization for Parallel ReLU Networks
Tolga Ergen
Mert Pilanci
32
16
0
18 Oct 2021
Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks
Shuai Zhang
Meng Wang
Sijia Liu
Pin-Yu Chen
Jinjun Xiong
UQCV
MLT
31
13
0
12 Oct 2021
Global Optimality Beyond Two Layers: Training Deep ReLU Networks via Convex Programs
Tolga Ergen
Mert Pilanci
OffRL
MLT
29
32
0
11 Oct 2021
Exponentially Many Local Minima in Quantum Neural Networks
Xuchen You
Xiaodi Wu
72
51
0
06 Oct 2021
Analytic Study of Families of Spurious Minima in Two-Layer ReLU Neural Networks: A Tale of Symmetry II
Yossi Arjevani
M. Field
28
18
0
21 Jul 2021
Equivariant bifurcation, quadratic equivariants, and symmetry breaking for the standard representation of $S_n$
Yossi Arjevani
M. Field
27
8
0
06 Jul 2021
A Geometric Analysis of Neural Collapse with Unconstrained Features
Zhihui Zhu
Tianyu Ding
Jinxin Zhou
Xiao Li
Chong You
Jeremias Sulam
Qing Qu
24
194
0
06 May 2021
Landscape analysis for shallow neural networks: complete classification of critical points for affine target functions
Patrick Cheridito
Arnulf Jentzen
Florian Rossmannek
24
10
0
19 Mar 2021
Understanding self-supervised Learning Dynamics without Contrastive Pairs
Yuandong Tian
Xinlei Chen
Surya Ganguli
SSL
138
281
0
12 Feb 2021
A Convergence Theory Towards Practical Over-parameterized Deep Neural Networks
Asaf Noy
Yi Tian Xu
Y. Aflalo
Lihi Zelnik-Manor
R. L. Jin
31
3
0
12 Jan 2021
The Nonconvex Geometry of Linear Inverse Problems
Armin Eftekhari
Peyman Mohajerin Esfahani
20
1
0
07 Jan 2021
Learning Graph Neural Networks with Approximate Gradient Descent
Qunwei Li
Shaofeng Zou
Leon Wenliang Zhong
GNN
32
1
0
07 Dec 2020
PAC Confidence Predictions for Deep Neural Network Classifiers
Sangdon Park
Shuo Li
Insup Lee
Osbert Bastani
UQCV
24
25
0
02 Nov 2020
Towards a Mathematical Understanding of Neural Network-Based Machine Learning: what we know and what we don't
Weinan E
Chao Ma
Stephan Wojtowytsch
Lei Wu
AI4CE
22
133
0
22 Sep 2020
Non-convergence of stochastic gradient descent in the training of deep neural networks
Patrick Cheridito
Arnulf Jentzen
Florian Rossmannek
14
37
0
12 Jun 2020
Symmetry & critical points for a model shallow neural network
Yossi Arjevani
M. Field
31
13
0
23 Mar 2020
Optimization for deep learning: theory and algorithms
Ruoyu Sun
ODL
16
168
0
19 Dec 2019
Growing axons: greedy learning of neural networks with application to function approximation
Daria Fokina
Ivan V. Oseledets
18
18
0
28 Oct 2019
The Local Elasticity of Neural Networks
Hangfeng He
Weijie J. Su
40
44
0
15 Oct 2019
Hessian based analysis of SGD for Deep Nets: Dynamics and Generalization
Xinyan Li
Qilong Gu
Yingxue Zhou
Tiancong Chen
A. Banerjee
ODL
34
51
0
24 Jul 2019
Neural ODEs as the Deep Limit of ResNets with constant weights
B. Avelin
K. Nystrom
ODL
37
31
0
28 Jun 2019
Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks
Sanjeev Arora
S. Du
Wei Hu
Zhiyuan Li
Ruosong Wang
MLT
35
962
0
24 Jan 2019
Width Provably Matters in Optimization for Deep Linear Neural Networks
S. Du
Wei Hu
16
93
0
24 Jan 2019
Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks
Difan Zou
Yuan Cao
Dongruo Zhou
Quanquan Gu
ODL
24
446
0
21 Nov 2018
Gradient Descent Finds Global Minima of Deep Neural Networks
S. Du
J. Lee
Haochuan Li
Liwei Wang
Xiyu Zhai
ODL
35
1,122
0
09 Nov 2018
A Closer Look at Deep Policy Gradients
Andrew Ilyas
Logan Engstrom
Shibani Santurkar
Dimitris Tsipras
Firdaus Janoos
Larry Rudolph
Aleksander Madry
22
50
0
06 Nov 2018
On the Convergence Rate of Training Recurrent Neural Networks
Zeyuan Allen-Zhu
Yuanzhi Li
Zhao-quan Song
18
191
0
29 Oct 2018
Benefits of over-parameterization with EM
Ji Xu
Daniel J. Hsu
A. Maleki
32
29
0
26 Oct 2018
Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity
Chulhee Yun
S. Sra
Ali Jadbabaie
18
117
0
17 Oct 2018