The loss surface of deep and wide neural networks
Quynh N. Nguyen, Matthias Hein · 26 April 2017 · arXiv:1704.08045 · ODL

Papers citing "The loss surface of deep and wide neural networks"

Showing 50 of 64 citing papers.
Low-Loss Space in Neural Networks is Continuous and Fully Connected
Yongding Tian, Zaid Al-Ars, Maksim Kitsak, P. Hofstee · 05 May 2025 · 3DPC

Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding
Zhengqing Wu, Berfin Simsek, François Ged · 08 Feb 2024 · ODL

Sparse Deep Learning for Time Series Data: Theory and Applications
Mingxuan Zhang, Y. Sun, Faming Liang · 05 Oct 2023 · AI4TS, OOD, BDL

How Spurious Features Are Memorized: Precise Analysis for Random and NTK Features
Simone Bombari, Marco Mondelli · 20 May 2023 · AAML

Online Learning Under A Separable Stochastic Approximation Framework
Min Gan, Xiang-Xiang Su, Guang-yong Chen, Jing Chen · 12 May 2023

Revisiting the Noise Model of Stochastic Gradient Descent
Barak Battash, Ofir Lindenbaum · 05 Mar 2023

Mechanistic Mode Connectivity
Ekdeep Singh Lubana, Eric J. Bigelow, Robert P. Dick, David M. Krueger, Hidenori Tanaka · 15 Nov 2022

MAC: A Meta-Learning Approach for Feature Learning and Recombination
S. Tiwari, M. Gogoi, S. Verma, K. P. Singh · 20 Sep 2022 · CLL

Wavelet Regularization Benefits Adversarial Training
Jun Yan, Huilin Yin, Xiaoyang Deng, Zi-qin Zhao, Wancheng Ge, Hao Zhang, Gerhard Rigoll · 08 Jun 2022 · AAML

Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape
Devansh Bisla, Jing Wang, A. Choromańska · 20 Jan 2022

A Kernel-Expanded Stochastic Neural Network
Y. Sun, F. Liang · 14 Jan 2022

On the Global Convergence of Gradient Descent for multi-layer ResNets in the mean-field regime
Zhiyan Ding, Shi Chen, Qin Li, S. Wright · 06 Oct 2021 · MLT, AI4CE

Exponentially Many Local Minima in Quantum Neural Networks
Xuchen You, Xiaodi Wu · 06 Oct 2021

Self-Paced Contrastive Learning for Semi-supervised Medical Image Segmentation with Meta-labels
Jizong Peng, Ping Wang, Christian Desrosiers, M. Pedersoli · 29 Jul 2021 · SSL

The loss landscape of deep linear neural networks: a second-order analysis
E. M. Achour, François Malgouyres, Sébastien Gerchinovitz · 28 Jul 2021 · ODL

Deep Network Approximation: Achieving Arbitrary Accuracy with Fixed Number of Neurons
Zuowei Shen, Haizhao Yang, Shijun Zhang · 06 Jul 2021

The Limitations of Large Width in Neural Networks: A Deep Gaussian Process Perspective
Geoff Pleiss, John P. Cunningham · 11 Jun 2021

Landscape analysis for shallow neural networks: complete classification of critical points for affine target functions
Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek · 19 Mar 2021

Optimal Approximation Rate of ReLU Networks in terms of Width and Depth
Zuowei Shen, Haizhao Yang, Shijun Zhang · 28 Feb 2021

Noisy Gradient Descent Converges to Flat Minima for Nonconvex Matrix Factorization
Tianyi Liu, Yan Li, S. Wei, Enlu Zhou, T. Zhao · 24 Feb 2021

Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks
Quynh N. Nguyen, Marco Mondelli, Guido Montúfar · 21 Dec 2020

Learning Graph Neural Networks with Approximate Gradient Descent
Qunwei Li, Shaofeng Zou, Leon Wenliang Zhong · 07 Dec 2020 · GNN

It's Hard for Neural Networks To Learn the Game of Life
Jacob Mitchell Springer, Garrett Kenyon · 03 Sep 2020

The Landscape of Matrix Factorization Revisited
Hossein Valavi, Sulin Liu, Peter J. Ramadge · 27 Feb 2020

On Interpretability of Artificial Neural Networks: A Survey
Fenglei Fan, Jinjun Xiong, Mengzhou Li, Ge Wang · 08 Jan 2020 · AAML, AI4CE

Revisiting Landscape Analysis in Deep Neural Networks: Eliminating Decreasing Paths to Infinity
Shiyu Liang, Ruoyu Sun, R. Srikant · 31 Dec 2019

Insights into Ordinal Embedding Algorithms: A Systematic Evaluation
L. C. Vankadara, Siavash Haghiri, Michael Lohaus, Faiz Ul Wahab, U. V. Luxburg · 03 Dec 2019

Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks
Yu Bai, J. Lee · 03 Oct 2019

Transferability and Hardness of Supervised Classification Tasks
Anh Tran, Cuong V Nguyen, Tal Hassner · 21 Aug 2019

Weight-space symmetry in deep networks gives rise to permutation saddles, connected by equal-loss valleys across the loss landscape
Johanni Brea, Berfin Simsek, Bernd Illing, W. Gerstner · 05 Jul 2019

Deep Network Approximation Characterized by Number of Neurons
Zuowei Shen, Haizhao Yang, Shijun Zhang · 13 Jun 2019

Fine-grained Optimization of Deep Neural Networks
Mete Ozay · 22 May 2019 · ODL

Every Local Minimum Value is the Global Minimum Value of Induced Model in Non-convex Machine Learning
Kenji Kawaguchi, Jiaoyang Huang, L. Kaelbling · 07 Apr 2019 · AAML

Nonlinear Approximation via Compositions
Zuowei Shen, Haizhao Yang, Shijun Zhang · 26 Feb 2019

Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks
Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruosong Wang · 24 Jan 2019 · MLT

Width Provably Matters in Optimization for Deep Linear Neural Networks
S. Du, Wei Hu · 24 Jan 2019

Non-attracting Regions of Local Minima in Deep and Wide Neural Networks
Henning Petzka, C. Sminchisescu · 16 Dec 2018

Gradient Descent Finds Global Minima of Deep Neural Networks
S. Du, J. Lee, Haochuan Li, Liwei Wang, Masayoshi Tomizuka · 09 Nov 2018 · ODL

On the Convergence Rate of Training Recurrent Neural Networks
Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song · 29 Oct 2018

Benefits of over-parameterization with EM
Ji Xu, Daniel J. Hsu, A. Maleki · 26 Oct 2018

Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel
Colin Wei, J. Lee, Qiang Liu, Tengyu Ma · 12 Oct 2018

A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks
Sanjeev Arora, Nadav Cohen, Noah Golowich, Wei Hu · 04 Oct 2018

Gradient Descent Provably Optimizes Over-parameterized Neural Networks
S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh · 04 Oct 2018 · MLT, ODL

Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning
Charles H. Martin, Michael W. Mahoney · 02 Oct 2018 · AI4CE

Filter Distillation for Network Compression
Xavier Suau, Luca Zappella, N. Apostoloff · 20 Jul 2018

Efficient Decentralized Deep Learning by Dynamic Model Averaging
Michael Kamp, Linara Adilova, Joachim Sicking, Fabian Hüger, Peter Schlicht, Tim Wirtz, Stefan Wrobel · 09 Jul 2018

ResNet with one-neuron hidden layers is a Universal Approximator
Hongzhou Lin, Stefanie Jegelka · 28 Jun 2018

Learning One-hidden-layer ReLU Networks via Gradient Descent
Xiao Zhang, Yaodong Yu, Lingxiao Wang, Quanquan Gu · 20 Jun 2018 · MLT

Universal Statistics of Fisher Information in Deep Neural Networks: Mean Field Approach
Ryo Karakida, S. Akaho, S. Amari · 04 Jun 2018 · FedML

How Many Samples are Needed to Estimate a Convolutional or Recurrent Neural Network?
S. Du, Yining Wang, Xiyu Zhai, Sivaraman Balakrishnan, Ruslan Salakhutdinov, Aarti Singh · 21 May 2018 · SSL