The loss surface of deep and wide neural networks
Quynh N. Nguyen, Matthias Hein · 26 April 2017 · arXiv:1704.08045 · ODL

Papers citing "The loss surface of deep and wide neural networks"

Showing 50 of 64 citing papers.
Low-Loss Space in Neural Networks is Continuous and Fully Connected
Yongding Tian, Zaid Al-Ars, Maksim Kitsak, P. Hofstee · 05 May 2025 · 3DPC

Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding
Zhengqing Wu, Berfin Simsek, François Ged · 08 Feb 2024 · ODL

Sparse Deep Learning for Time Series Data: Theory and Applications
Mingxuan Zhang, Y. Sun, Faming Liang · 05 Oct 2023 · AI4TS, OOD, BDL

How Spurious Features Are Memorized: Precise Analysis for Random and NTK Features
Simone Bombari, Marco Mondelli · 20 May 2023 · AAML

Online Learning Under A Separable Stochastic Approximation Framework
Min Gan, Xiang-Xiang Su, Guang-yong Chen, Jing Chen · 12 May 2023

Revisiting the Noise Model of Stochastic Gradient Descent
Barak Battash, Ofir Lindenbaum · 05 Mar 2023

Mechanistic Mode Connectivity
Ekdeep Singh Lubana, Eric J. Bigelow, Robert P. Dick, David M. Krueger, Hidenori Tanaka · 15 Nov 2022

MAC: A Meta-Learning Approach for Feature Learning and Recombination
S. Tiwari, M. Gogoi, S. Verma, K. P. Singh · 20 Sep 2022 · CLL

Wavelet Regularization Benefits Adversarial Training
Jun Yan, Huilin Yin, Xiaoyang Deng, Zi-qin Zhao, Wancheng Ge, Hao Zhang, Gerhard Rigoll · 08 Jun 2022 · AAML

Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape
Devansh Bisla, Jing Wang, A. Choromańska · 20 Jan 2022

A Kernel-Expanded Stochastic Neural Network
Y. Sun, F. Liang · 14 Jan 2022

On the Global Convergence of Gradient Descent for multi-layer ResNets in the mean-field regime
Zhiyan Ding, Shi Chen, Qin Li, S. Wright · 06 Oct 2021 · MLT, AI4CE

Exponentially Many Local Minima in Quantum Neural Networks
Xuchen You, Xiaodi Wu · 06 Oct 2021

Self-Paced Contrastive Learning for Semi-supervised Medical Image Segmentation with Meta-labels
Jizong Peng, Ping Wang, Christian Desrosiers, M. Pedersoli · 29 Jul 2021 · SSL

The loss landscape of deep linear neural networks: a second-order analysis
E. M. Achour, François Malgouyres, Sébastien Gerchinovitz · 28 Jul 2021 · ODL

Deep Network Approximation: Achieving Arbitrary Accuracy with Fixed Number of Neurons
Zuowei Shen, Haizhao Yang, Shijun Zhang · 06 Jul 2021

The Limitations of Large Width in Neural Networks: A Deep Gaussian Process Perspective
Geoff Pleiss, John P. Cunningham · 11 Jun 2021

Landscape analysis for shallow neural networks: complete classification of critical points for affine target functions
Patrick Cheridito, Arnulf Jentzen, Florian Rossmannek · 19 Mar 2021

Optimal Approximation Rate of ReLU Networks in terms of Width and Depth
Zuowei Shen, Haizhao Yang, Shijun Zhang · 28 Feb 2021

Noisy Gradient Descent Converges to Flat Minima for Nonconvex Matrix Factorization
Tianyi Liu, Yan Li, S. Wei, Enlu Zhou, T. Zhao · 24 Feb 2021

Tight Bounds on the Smallest Eigenvalue of the Neural Tangent Kernel for Deep ReLU Networks
Quynh N. Nguyen, Marco Mondelli, Guido Montúfar · 21 Dec 2020

Learning Graph Neural Networks with Approximate Gradient Descent
Qunwei Li, Shaofeng Zou, Leon Wenliang Zhong · 07 Dec 2020 · GNN

It's Hard for Neural Networks To Learn the Game of Life
Jacob Mitchell Springer, Garrett Kenyon · 03 Sep 2020

The Landscape of Matrix Factorization Revisited
Hossein Valavi, Sulin Liu, Peter J. Ramadge · 27 Feb 2020

On Interpretability of Artificial Neural Networks: A Survey
Fenglei Fan, Jinjun Xiong, Mengzhou Li, Ge Wang · 08 Jan 2020 · AAML, AI4CE

Revisiting Landscape Analysis in Deep Neural Networks: Eliminating Decreasing Paths to Infinity
Shiyu Liang, Ruoyu Sun, R. Srikant · 31 Dec 2019

Insights into Ordinal Embedding Algorithms: A Systematic Evaluation
L. C. Vankadara, Siavash Haghiri, Michael Lohaus, Faiz Ul Wahab, U. V. Luxburg · 03 Dec 2019

Beyond Linearization: On Quadratic and Higher-Order Approximation of Wide Neural Networks
Yu Bai, J. Lee · 03 Oct 2019

Transferability and Hardness of Supervised Classification Tasks
Anh Tran, Cuong V Nguyen, Tal Hassner · 21 Aug 2019

Weight-space symmetry in deep networks gives rise to permutation saddles, connected by equal-loss valleys across the loss landscape
Johanni Brea, Berfin Simsek, Bernd Illing, W. Gerstner · 05 Jul 2019

Deep Network Approximation Characterized by Number of Neurons
Zuowei Shen, Haizhao Yang, Shijun Zhang · 13 Jun 2019

Fine-grained Optimization of Deep Neural Networks
Mete Ozay · 22 May 2019 · ODL

Every Local Minimum Value is the Global Minimum Value of Induced Model in Non-convex Machine Learning
Kenji Kawaguchi, Jiaoyang Huang, L. Kaelbling · 07 Apr 2019 · AAML

Nonlinear Approximation via Compositions
Zuowei Shen, Haizhao Yang, Shijun Zhang · 26 Feb 2019

Fine-Grained Analysis of Optimization and Generalization for Overparameterized Two-Layer Neural Networks
Sanjeev Arora, S. Du, Wei Hu, Zhiyuan Li, Ruosong Wang · 24 Jan 2019 · MLT

Width Provably Matters in Optimization for Deep Linear Neural Networks
S. Du, Wei Hu · 24 Jan 2019

Non-attracting Regions of Local Minima in Deep and Wide Neural Networks
Henning Petzka, C. Sminchisescu · 16 Dec 2018

Gradient Descent Finds Global Minima of Deep Neural Networks
S. Du, J. Lee, Haochuan Li, Liwei Wang, Masayoshi Tomizuka · 09 Nov 2018 · ODL

On the Convergence Rate of Training Recurrent Neural Networks
Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song · 29 Oct 2018

Benefits of over-parameterization with EM
Ji Xu, Daniel J. Hsu, A. Maleki · 26 Oct 2018

Regularization Matters: Generalization and Optimization of Neural Nets v.s. their Induced Kernel
Colin Wei, J. Lee, Qiang Liu, Tengyu Ma · 12 Oct 2018

A Convergence Analysis of Gradient Descent for Deep Linear Neural Networks
Sanjeev Arora, Nadav Cohen, Noah Golowich, Wei Hu · 04 Oct 2018

Gradient Descent Provably Optimizes Over-parameterized Neural Networks
S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh · 04 Oct 2018 · MLT, ODL

Implicit Self-Regularization in Deep Neural Networks: Evidence from Random Matrix Theory and Implications for Learning
Charles H. Martin, Michael W. Mahoney · 02 Oct 2018 · AI4CE

Filter Distillation for Network Compression
Xavier Suau, Luca Zappella, N. Apostoloff · 20 Jul 2018

Efficient Decentralized Deep Learning by Dynamic Model Averaging
Michael Kamp, Linara Adilova, Joachim Sicking, Fabian Hüger, Peter Schlicht, Tim Wirtz, Stefan Wrobel · 09 Jul 2018

ResNet with one-neuron hidden layers is a Universal Approximator
Hongzhou Lin, Stefanie Jegelka · 28 Jun 2018

Learning One-hidden-layer ReLU Networks via Gradient Descent
Xiao Zhang, Yaodong Yu, Lingxiao Wang, Quanquan Gu · 20 Jun 2018 · MLT

Universal Statistics of Fisher Information in Deep Neural Networks: Mean Field Approach
Ryo Karakida, S. Akaho, S. Amari · 04 Jun 2018 · FedML

How Many Samples are Needed to Estimate a Convolutional or Recurrent Neural Network?
S. Du, Yining Wang, Xiyu Zhai, Sivaraman Balakrishnan, Ruslan Salakhutdinov, Aarti Singh · 21 May 2018 · SSL