Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs

arXiv:1702.07966

26 February 2017
Alon Brutzkus
Amir Globerson
    MLT

Papers citing "Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs"

37 / 87 papers shown
Width Provably Matters in Optimization for Deep Linear Neural Networks
S. Du
Wei Hu
21
94
0
24 Jan 2019
Stochastic Gradient Descent Optimizes Over-parameterized Deep ReLU Networks
Difan Zou
Yuan Cao
Dongruo Zhou
Quanquan Gu
ODL
33
446
0
21 Nov 2018
Gradient Descent Finds Global Minima of Deep Neural Networks
S. Du
J. Lee
Haochuan Li
Liwei Wang
Xiyu Zhai
ODL
44
1,122
0
09 Nov 2018
On the Convergence Rate of Training Recurrent Neural Networks
Zeyuan Allen-Zhu
Yuanzhi Li
Zhao Song
23
191
0
29 Oct 2018
Subgradient Descent Learns Orthogonal Dictionaries
Yu Bai
Qijia Jiang
Ju Sun
20
51
0
25 Oct 2018
Small ReLU networks are powerful memorizers: a tight analysis of memorization capacity
Chulhee Yun
S. Sra
Ali Jadbabaie
26
117
0
17 Oct 2018
Learning Two-layer Neural Networks with Symmetric Inputs
Rong Ge
Rohith Kuditipudi
Zhize Li
Xiang Wang
OOD
MLT
36
57
0
16 Oct 2018
Why do Larger Models Generalize Better? A Theoretical Perspective via the XOR Problem
Alon Brutzkus
Amir Globerson
MLT
11
7
0
06 Oct 2018
A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks
Sanjeev Arora
Nadav Cohen
Noah Golowich
Wei Hu
27
281
0
04 Oct 2018
Gradient Descent Provably Optimizes Over-parameterized Neural Networks
S. Du
Xiyu Zhai
Barnabás Póczos
Aarti Singh
MLT
ODL
53
1,250
0
04 Oct 2018
Towards Understanding Regularization in Batch Normalization
Ping Luo
Xinjiang Wang
Wenqi Shao
Zhanglin Peng
MLT
AI4CE
23
179
0
04 Sep 2018
Blended Coarse Gradient Descent for Full Quantization of Deep Neural Networks
Penghang Yin
Shuai Zhang
J. Lyu
Stanley Osher
Y. Qi
Jack Xin
MQ
44
61
0
15 Aug 2018
Learning ReLU Networks on Linearly Separable Data: Algorithm, Optimality, and Generalization
G. Wang
G. Giannakis
Jie Chen
MLT
24
131
0
14 Aug 2018
Learning One-hidden-layer ReLU Networks via Gradient Descent
Xiao Zhang
Yaodong Yu
Lingxiao Wang
Quanquan Gu
MLT
28
134
0
20 Jun 2018
Deep Neural Networks with Multi-Branch Architectures Are Less Non-Convex
Hongyang R. Zhang
Junru Shao
Ruslan Salakhutdinov
39
14
0
06 Jun 2018
Adding One Neuron Can Eliminate All Bad Local Minima
Shiyu Liang
Ruoyu Sun
J. Lee
R. Srikant
37
89
0
22 May 2018
How Many Samples are Needed to Estimate a Convolutional or Recurrent Neural Network?
S. Du
Yining Wang
Xiyu Zhai
Sivaraman Balakrishnan
Ruslan Salakhutdinov
Aarti Singh
SSL
21
57
0
21 May 2018
Improved Learning of One-hidden-layer Convolutional Neural Networks with Overlaps
S. Du
Surbhi Goel
MLT
30
17
0
20 May 2018
End-to-end Learning of a Convolutional Neural Network via Deep Tensor Decomposition
Samet Oymak
Mahdi Soltanolkotabi
21
12
0
16 May 2018
A Mean Field View of the Landscape of Two-Layers Neural Networks
Song Mei
Andrea Montanari
Phan-Minh Nguyen
MLT
43
850
0
18 Apr 2018
Understanding the Loss Surface of Neural Networks for Binary Classification
Shiyu Liang
Ruoyu Sun
Yixuan Li
R. Srikant
21
87
0
19 Feb 2018
Gradient descent with identity initialization efficiently learns positive definite linear transformations by deep residual networks
Peter L. Bartlett
D. Helmbold
Philip M. Long
33
116
0
16 Feb 2018
Learning One Convolutional Layer with Overlapping Patches
Surbhi Goel
Adam R. Klivans
Raghu Meka
MLT
16
80
0
07 Feb 2018
Learning Compact Neural Networks with Regularization
Samet Oymak
MLT
41
39
0
05 Feb 2018
Spurious Local Minima are Common in Two-Layer ReLU Neural Networks
Itay Safran
Ohad Shamir
40
261
0
24 Dec 2017
Non-convex Optimization for Machine Learning
Prateek Jain
Purushottam Kar
33
479
0
21 Dec 2017
SGD Learns Over-parameterized Networks that Provably Generalize on Linearly Separable Data
Alon Brutzkus
Amir Globerson
Eran Malach
Shai Shalev-Shwartz
MLT
50
276
0
27 Oct 2017
Theoretical insights into the optimization landscape of over-parameterized shallow neural networks
Mahdi Soltanolkotabi
Adel Javanmard
J. Lee
36
415
0
16 Jul 2017
Recovery Guarantees for One-hidden-layer Neural Networks
Kai Zhong
Zhao Song
Prateek Jain
Peter L. Bartlett
Inderjit S. Dhillon
MLT
34
336
0
10 Jun 2017
On the stable recovery of deep structured linear networks under sparsity constraints
F. Malgouyres
24
7
0
31 May 2017
Learning ReLUs via Gradient Descent
Mahdi Soltanolkotabi
MLT
20
181
0
10 May 2017
Estimating the Coefficients of a Mixture of Two Linear Regressions by Expectation Maximization
Jason M. Klusowski
Dana Yang
W. Brinda
34
41
0
26 Apr 2017
The loss surface of deep and wide neural networks
Quynh N. Nguyen
Matthias Hein
ODL
51
283
0
26 Apr 2017
Convergence Results for Neural Networks via Electrodynamics
Rina Panigrahy
Sushant Sachdeva
Qiuyi Zhang
MLT
MDE
29
22
0
01 Feb 2017
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,746
0
26 Sep 2016
Approximation by Combinations of ReLU and Squared ReLU Ridge Functions with $\ell^1$ and $\ell^0$ Controls
Jason M. Klusowski
Andrew R. Barron
132
142
0
26 Jul 2016
The Loss Surfaces of Multilayer Networks
A. Choromańska
Mikael Henaff
Michaël Mathieu
Gerard Ben Arous
Yann LeCun
ODL
183
1,185
0
30 Nov 2014