Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima
S. Du, Jason D. Lee, Yuandong Tian, Barnabás Póczós, Aarti Singh · MLT
arXiv 1712.00779 · 3 December 2017

Papers citing "Gradient Descent Learns One-hidden-layer CNN: Don't be Afraid of Spurious Local Minima" (50 of 102 papers shown)

Stepsize anything: A unified learning rate schedule for budgeted-iteration training
Anda Tang, Yiming Dong, Yutao Zeng, Zhou Xun, Zhouchen Lin · 30 May 2025

Analysis of the rate of convergence of an over-parametrized convolutional neural network image classifier learned by gradient descent
Michael Kohler, A. Krzyżak, Benjamin Walter · 13 May 2024

How does promoting the minority fraction affect generalization? A theoretical study of the one-hidden-layer neural network on group imbalance
Hongkang Li, Shuai Zhang, Yihua Zhang, Meng Wang, Sijia Liu, Pin-Yu Chen · 12 Mar 2024

Scan and Snap: Understanding Training Dynamics and Token Composition in 1-layer Transformer
Yuandong Tian, Yiping Wang, Beidi Chen, S. Du · MLT · 25 May 2023

Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron
Weihang Xu, S. Du · 20 Feb 2023

Bayesian Interpolation with Deep Linear Networks
Boris Hanin, Alexander Zlokapa · 29 Dec 2022

Finite Sample Identification of Wide Shallow Neural Networks with Biases
M. Fornasier, T. Klock, Marco Mondelli, Michael Rauchensteiner · 08 Nov 2022

Annihilation of Spurious Minima in Two-Layer ReLU Networks
Yossi Arjevani, M. Field · 12 Oct 2022

Explicitising The Implicit Intrepretability of Deep Neural Networks Via Duality
Chandrashekar Lakshminarayanan, Ashutosh Kumar Singh, A. Rajkumar · AI4CE · 01 Mar 2022

Benign Overfitting in Two-layer Convolutional Neural Networks
Yuan Cao, Zixiang Chen, M. Belkin, Quanquan Gu · MLT · 14 Feb 2022

Understanding Deep Contrastive Learning via Coordinate-wise Optimization
Yuandong Tian · 29 Jan 2022

Generalization Performance of Empirical Risk Minimization on Over-parameterized Deep ReLU Nets
Shao-Bo Lin, Yao Wang, Ding-Xuan Zhou · ODL · 28 Nov 2021

Mode connectivity in the loss landscape of parameterized quantum circuits
Kathleen E. Hamilton, E. Lynn, R. Pooser · 09 Nov 2021

GradSign: Model Performance Inference with Theoretical Insights
Zhihao Zhang, Zhihao Jia · 16 Oct 2021

Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks
Shuai Zhang, Meng Wang, Sijia Liu, Pin-Yu Chen, Jinjun Xiong · UQCV, MLT · 12 Oct 2021

Constants of Motion: The Antidote to Chaos in Optimization and Game Dynamics
Georgios Piliouras, Xiao Wang · 08 Sep 2021

Analytic Study of Families of Spurious Minima in Two-Layer ReLU Neural Networks: A Tale of Symmetry II
Yossi Arjevani, M. Field · 21 Jul 2021

Continual Learning in the Teacher-Student Setup: Impact of Task Similarity
Sebastian Lee, Sebastian Goldt, Andrew M. Saxe · CLL · 09 Jul 2021

Neural Active Learning with Performance Guarantees
Pranjal Awasthi, Christoph Dann, Claudio Gentile, Ayush Sekhari, Zhilei Wang · 06 Jun 2021

From Local Pseudorandom Generators to Hardness of Learning
Amit Daniely, Gal Vardi · 20 Jan 2021

Learning Graph Neural Networks with Approximate Gradient Descent
Qunwei Li, Shaofeng Zou, Leon Wenliang Zhong · GNN · 07 Dec 2020

Align, then memorise: the dynamics of learning with feedback alignment
Maria Refinetti, Stéphane d'Ascoli, Ruben Ohana, Sebastian Goldt · 24 Nov 2020

Inductive Bias of Gradient Descent for Weight Normalized Smooth Homogeneous Neural Nets
Depen Morwani, H. G. Ramaswamy · 24 Oct 2020

Computational Separation Between Convolutional and Fully-Connected Networks
Eran Malach, Shai Shalev-Shwartz · 03 Oct 2020

Generalized Leverage Score Sampling for Neural Networks
Jason D. Lee, Ruoqi Shen, Zhao Song, Mengdi Wang, Zheng Yu · 21 Sep 2020

Nonparametric Learning of Two-Layer ReLU Residual Units
Zhunxuan Wang, Linyun He, Chunchuan Lyu, Shay B. Cohen · MLT, OffRL · 17 Aug 2020

Analytic Characterization of the Hessian in Shallow ReLU Models: A Tale of Symmetry
Yossi Arjevani, M. Field · 04 Aug 2020

From Boltzmann Machines to Neural Networks and Back Again
Surbhi Goel, Adam R. Klivans, Frederic Koehler · 25 Jul 2020

Learning Over-Parametrized Two-Layer ReLU Neural Networks beyond NTK
Yuanzhi Li, Tengyu Ma, Hongyang R. Zhang · MLT · 09 Jul 2020

Towards an Understanding of Residual Networks Using Neural Tangent Hierarchy (NTH)
Yuqing Li, Yaoyu Zhang, N. Yip · 07 Jul 2020

Optimization and Generalization of Shallow Neural Networks with Quadratic Activation Functions
Stefano Sarao Mannelli, Eric Vanden-Eijnden, Lenka Zdeborová · AI4CE · 27 Jun 2020

The Gaussian equivalence of generative models for learning with shallow neural networks
Sebastian Goldt, Bruno Loureiro, Galen Reeves, Florent Krzakala, M. Mézard, Lenka Zdeborová · BDL · 25 Jun 2020

Fast Learning of Graph Neural Networks with Guaranteed Generalizability: One-hidden-layer Case
Shuai Zhang, Meng Wang, Sijia Liu, Pin-Yu Chen, Jinjun Xiong · MLT, AI4CE · 25 Jun 2020

Hardness of Learning Neural Networks with Natural Weights
Amit Daniely, Gal Vardi · 05 Jun 2020

Understanding and Improving Information Transfer in Multi-Task Learning
Sen Wu, Hongyang R. Zhang, Christopher Ré · 02 May 2020

Piecewise linear activations substantially shape the loss surfaces of neural networks
Fengxiang He, Bohan Wang, Dacheng Tao · ODL · 27 Mar 2020

Symmetry & critical points for a model shallow neural network
Yossi Arjevani, M. Field · 23 Mar 2020

An Optimization and Generalization Analysis for Max-Pooling Networks
Alon Brutzkus, Amir Globerson · MLT, AI4CE · 22 Feb 2020

Replica Exchange for Non-Convex Optimization
Jing-rong Dong, Xin T. Tong · 23 Jan 2020

Thresholds of descending algorithms in inference problems
Stefano Sarao Mannelli, Lenka Zdeborová · AI4CE · 02 Jan 2020

Optimization for deep learning: theory and algorithms
Ruoyu Sun · ODL · 19 Dec 2019

Naive Gabor Networks for Hyperspectral Image Classification
Chenying Liu, Jun Li, Lin He, Antonio J. Plaza, Shutao Li, Bo Li · 09 Dec 2019

Over-parametrized deep neural networks do not generalize well
Michael Kohler, A. Krzyżak · 09 Dec 2019

Tight Sample Complexity of Learning One-hidden-layer Convolutional Neural Networks
Yuan Cao, Quanquan Gu · MLT · 12 Nov 2019

Towards Understanding the Importance of Shortcut Connections in Residual Networks
Tianyi Liu, Minshuo Chen, Mo Zhou, S. Du, Enlu Zhou, T. Zhao · 10 Sep 2019

Towards Understanding the Importance of Noise in Training Neural Networks
Mo Zhou, Tianyi Liu, Yan Li, Dachao Lin, Enlu Zhou, T. Zhao · MLT · 07 Sep 2019

Theoretical Issues in Deep Networks: Approximation, Optimization and Generalization
T. Poggio, Andrzej Banburski, Q. Liao · ODL · 25 Aug 2019

Hessian based analysis of SGD for Deep Nets: Dynamics and Generalization
Xinyan Li, Qilong Gu, Yingxue Zhou, Tiancong Chen, A. Banerjee · ODL · 24 Jul 2019

Who is Afraid of Big Bad Minima? Analysis of Gradient-Flow in a Spiked Matrix-Tensor Model
Stefano Sarao Mannelli, Giulio Biroli, C. Cammarota, Florent Krzakala, Lenka Zdeborová · 18 Jul 2019

Towards Explaining the Regularization Effect of Initial Large Learning Rate in Training Neural Networks
Yuanzhi Li, Colin Wei, Tengyu Ma · 10 Jul 2019