Deep Learning without Poor Local Minima
Kenji Kawaguchi · ODL · 23 May 2016 · arXiv:1605.07110

Papers citing "Deep Learning without Poor Local Minima"

50 / 195 papers shown:
• Uncovering Critical Sets of Deep Neural Networks via Sample-Independent Critical Lifting
  Leyang Zhang, Yaoyu Zhang, Tao Luo · BDL · 19 May 2025
• System Identification and Control Using Lyapunov-Based Deep Neural Networks without Persistent Excitation: A Concurrent Learning Approach
  Rebecca G. Hart, Omkar Sudhir Patil, Zachary I. Bell, Warren E. Dixon · 15 May 2025
• Stacking as Accelerated Gradient Descent
  Naman Agarwal, Pranjal Awasthi, Satyen Kale, Eric Zhao · ODL · 20 Feb 2025
• Effects of Random Edge-Dropping on Over-Squashing in Graph Neural Networks
  Jasraj Singh, Keyue Jiang, Brooks Paige, Laura Toni · 11 Feb 2025
• Geometry and Optimization of Shallow Polynomial Networks
  Yossi Arjevani, Joan Bruna, Joe Kileel, Elzbieta Polak, Matthew Trager · 10 Jan 2025
• Mean-Field Analysis for Learning Subspace-Sparse Polynomials with Gaussian Input
  Ziang Chen, Rong Ge · MLT · 10 Jan 2025
• AdaRankGrad: Adaptive Gradient-Rank and Moments for Memory-Efficient LLMs Training and Fine-Tuning
  Yehonathan Refael, Jonathan Svirsky, Boris Shustin, Wasim Huleihel, Ofir Lindenbaum · 31 Dec 2024
• Input Space Mode Connectivity in Deep Neural Networks
  Jakub Vrabel, Ori Shem-Ur, Yaron Oz, David Krueger · 09 Sep 2024
• Analysis of the rate of convergence of an over-parametrized convolutional neural network image classifier learned by gradient descent
  Michael Kohler, A. Krzyżak, Benjamin Walter · 13 May 2024
• Merging Text Transformer Models from Different Initializations
  Neha Verma, Maha Elbayad · MoMe · 01 Mar 2024
• Loss Landscape of Shallow ReLU-like Neural Networks: Stationary Points, Saddle Escape, and Network Embedding
  Zhengqing Wu, Berfin Simsek, Francois Ged · ODL · 08 Feb 2024
• Critical Influence of Overparameterization on Sharpness-aware Minimization
  Sungbin Shin, Dongyeop Lee, Maksym Andriushchenko, Namhoon Lee · AAML · 29 Nov 2023
• Provably Accelerating Ill-Conditioned Low-rank Estimation via Scaled Gradient Descent, Even with Overparameterization
  Cong Ma, Xingyu Xu, Tian Tong, Yuejie Chi · 09 Oct 2023
• Sharpness-Aware Graph Collaborative Filtering
  Huiyuan Chen, Chin-Chia Michael Yeh, Yujie Fan, Yan Zheng, Junpeng Wang, Vivian Lai, Mahashweta Das, Hao Yang · 18 Jul 2023
• Snapshot Spectral Clustering -- a costless approach to deep clustering ensembles generation
  Adam Piróg, Halina Kwasnicka · 17 Jul 2023
• Function Space and Critical Points of Linear Convolutional Networks
  Kathlén Kohn, Guido Montúfar, Vahid Shahverdi, Matthew Trager · 12 Apr 2023
• Saddle-to-Saddle Dynamics in Diagonal Linear Networks
  Scott Pesme, Nicolas Flammarion · 02 Apr 2023
• Critical Points and Convergence Analysis of Generative Deep Linear Networks Trained with Bures-Wasserstein Loss
  Pierre Bréchet, Katerina Papagiannouli, Jing An, Guido Montúfar · 06 Mar 2023
• On a continuous time model of gradient descent dynamics and instability in deep learning
  Mihaela Rosca, Yan Wu, Chongli Qin, Benoit Dherin · 03 Feb 2023
• Read the Signs: Towards Invariance to Gradient Descent's Hyperparameter Initialization
  Davood Wadi, M. Fredette, S. Sénécal · ODL, AI4CE · 24 Jan 2023
• Mechanistic Mode Connectivity
  Ekdeep Singh Lubana, Eric J. Bigelow, Robert P. Dick, David M. Krueger, Hidenori Tanaka · 15 Nov 2022
• A New Perspective for Understanding Generalization Gap of Deep Neural Networks Trained with Large Batch Sizes
  O. Oyedotun, Konstantinos Papadopoulos, Djamila Aouada · AI4CE · 21 Oct 2022
• When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work
  Jiawei Zhang, Yushun Zhang, Mingyi Hong, Ruoyu Sun, Zhi-Quan Luo · 21 Oct 2022
• TiDAL: Learning Training Dynamics for Active Learning
  Seong Min Kye, Kwanghee Choi, Hyeongmin Byun, Buru Chang · 13 Oct 2022
• Zeroth-Order Negative Curvature Finding: Escaping Saddle Points without Gradients
  Hualin Zhang, Huan Xiong, Bin Gu · 04 Oct 2022
• Analysis of the rate of convergence of an over-parametrized deep neural network estimate learned by gradient descent
  Michael Kohler, A. Krzyżak · 04 Oct 2022
• What shapes the loss landscape of self-supervised learning?
  Liu Ziyin, Ekdeep Singh Lubana, Masahito Ueda, Hidenori Tanaka · 02 Oct 2022
• Behind the Scenes of Gradient Descent: A Trajectory Analysis via Basis Function Decomposition
  Jianhao Ma, Li-Zhen Guo, S. Fattahi · 01 Oct 2022
• Deep Linear Networks can Benignly Overfit when Shallow Ones Do
  Niladri S. Chatterji, Philip M. Long · 19 Sep 2022
• Git Re-Basin: Merging Models modulo Permutation Symmetries
  Samuel K. Ainsworth, J. Hayase, S. Srinivasa · MoMe · 11 Sep 2022
• Blessing of Nonconvexity in Deep Linear Models: Depth Flattens the Optimization Landscape Around the True Solution
  Jianhao Ma, S. Fattahi · 15 Jul 2022
• Neural Collapse: A Review on Modelling Principles and Generalization
  Vignesh Kothapalli · 08 Jun 2022
• CoNSoLe: Convex Neural Symbolic Learning
  Haoran Li, Yang Weng, Hanghang Tong · 01 Jun 2022
• Star algorithm for NN ensembling
  Sergey Zinchenko, Dmitry Lishudi · FedML · 01 Jun 2022
• Non-convex online learning via algorithmic equivalence
  Udaya Ghai, Zhou Lu, Elad Hazan · 30 May 2022
• Overparameterization Improves StyleGAN Inversion
  Yohan Poirier-Ginter, Alexandre Lessard, Ryan Smith, Jean-François Lalonde · 12 May 2022
• Statistical Guarantees for Approximate Stationary Points of Simple Neural Networks
  Mahsa Taheri, Fang Xie, Johannes Lederer · 09 May 2022
• On Feature Learning in Neural Networks with Global Convergence Guarantees
  Zhengdao Chen, Eric Vanden-Eijnden, Joan Bruna · MLT · 22 Apr 2022
• Side Effects of Learning from Low-dimensional Data Embedded in a Euclidean Space
  Juncai He, R. Tsai, Rachel A. Ward · 01 Mar 2022
• Deep Constrained Least Squares for Blind Image Super-Resolution
  Ziwei Luo, Haibin Huang, Lei Yu, Youwei Li, Haoqiang Fan, Shuaicheng Liu · SupR · 15 Feb 2022
• PFGE: Parsimonious Fast Geometric Ensembling of DNNs
  Hao Guo, Jiyong Jin, B. Liu · FedML · 14 Feb 2022
• Exact Solutions of a Deep Linear Network
  Liu Ziyin, Botao Li, Xiangming Meng · ODL · 10 Feb 2022
• Stochastic Neural Networks with Infinite Width are Deterministic
  Liu Ziyin, Hanlin Zhang, Xiangming Meng, Yuting Lu, Eric P. Xing, Masahito Ueda · 30 Jan 2022
• Understanding Deep Contrastive Learning via Coordinate-wise Optimization
  Yuandong Tian · 29 Jan 2022
• Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks
  Bartlomiej Polaczyk, J. Cyranka · ODL · 28 Jan 2022
• Low-Pass Filtering SGD for Recovering Flat Optima in the Deep Learning Optimization Landscape
  Devansh Bisla, Jing Wang, A. Choromańska · 20 Jan 2022
• Deep Network Approximation in Terms of Intrinsic Parameters
  Zuowei Shen, Haizhao Yang, Shijun Zhang · 15 Nov 2021
• Gradients are Not All You Need
  Luke Metz, C. Freeman, S. Schoenholz, Tal Kachman · 10 Nov 2021
• Mode connectivity in the loss landscape of parameterized quantum circuits
  Kathleen E. Hamilton, E. Lynn, R. Pooser · 09 Nov 2021
• Imitating Deep Learning Dynamics via Locally Elastic Stochastic Differential Equations
  Jiayao Zhang, Hua Wang, Weijie J. Su · 11 Oct 2021