ResearchTrend.AI

A Local Convergence Theory for Mildly Over-Parameterized Two-Layer Neural Network

4 February 2021
Mo Zhou, Rong Ge, Chi Jin
arXiv:2102.02410

Papers citing "A Local Convergence Theory for Mildly Over-Parameterized Two-Layer Neural Network"

13 papers shown:
1. Over-Parameterization Exponentially Slows Down Gradient Descent for Learning a Single Neuron. Weihang Xu, S. Du. 20 Feb 2023.
2. Learning Single-Index Models with Shallow Neural Networks. A. Bietti, Joan Bruna, Clayton Sanford, M. Song. 27 Oct 2022.
3. When Expressivity Meets Trainability: Fewer than $n$ Neurons Can Work. Jiawei Zhang, Yushun Zhang, Mingyi Hong, Ruoyu Sun, Z. Luo. 21 Oct 2022.
4. Global Convergence of SGD On Two Layer Neural Nets. Pulkit Gopalani, Anirbit Mukherjee. 20 Oct 2022.
5. Robustness in deep learning: The good (width), the bad (depth), and the ugly (initialization). Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, V. Cevher. 15 Sep 2022.
6. Optimizing the Performative Risk under Weak Convexity Assumptions. Yulai Zhao. 02 Sep 2022.
7. Intersection of Parallels as an Early Stopping Criterion. Ali Vardasbi, Maarten de Rijke, Mostafa Dehghani. 19 Aug 2022.
8. Gradient flow dynamics of shallow ReLU networks for square loss and orthogonal inputs. Etienne Boursier, Loucas Pillaud-Vivien, Nicolas Flammarion. 02 Jun 2022.
9. On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias. Itay Safran, Gal Vardi, Jason D. Lee. 18 May 2022.
10. Parameter identifiability of a deep feedforward ReLU neural network. Joachim Bona-Pellissier, François Bachoc, François Malgouyres. 24 Dec 2021.
11. The Convex Geometry of Backpropagation: Neural Network Gradient Flows Converge to Extreme Points of the Dual Convex Program. Yifei Wang, Mert Pilanci. 13 Oct 2021.
12. Sparse Bayesian Deep Learning for Dynamic System Identification. Hongpeng Zhou, Chahine Ibrahim, W. Zheng, Wei Pan. 27 Jul 2021.
13. Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition. Hamed Karimi, J. Nutini, Mark W. Schmidt. 16 Aug 2016.