On the Proof of Global Convergence of Gradient Descent for Deep ReLU Networks with Linear Widths
Quynh N. Nguyen · 24 January 2021 · arXiv:2101.09612
Papers citing "On the Proof of Global Convergence of Gradient Descent for Deep ReLU Networks with Linear Widths" (15 of 15 shown)
Feature Learning Beyond the Edge of Stability
Dávid Terjék · MLT · 18 Feb 2025

Understanding the training of infinitely deep and wide ResNets with Conditional Optimal Transport
Raphael Barboni, Gabriel Peyré, François-Xavier Vialard · 19 Mar 2024

Global Convergence Rate of Deep Equilibrium Models with General Activations
Lan V. Truong · 11 Feb 2023

Learning Lipschitz Functions by GD-trained Shallow Overparameterized ReLU Neural Networks
Ilja Kuzborskij, Csaba Szepesvári · 28 Dec 2022

Characterizing the Spectrum of the NTK via a Power Series Expansion
Michael Murray, Hui Jin, Benjamin Bowman, Guido Montúfar · 15 Nov 2022

When Expressivity Meets Trainability: Fewer than n Neurons Can Work
Jiawei Zhang, Yushun Zhang, Mingyi Hong, Ruoyu Sun, Zhi-Quan Luo · 21 Oct 2022

On skip connections and normalisation layers in deep optimisation
L. MacDonald, Jack Valmadre, Hemanth Saratchandran, Simon Lucey · ODL · 10 Oct 2022

Generalization Properties of NAS under Activation and Skip Connection Search
Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, V. Cevher · AI4CE · 15 Sep 2022

Global Convergence of Over-parameterized Deep Equilibrium Models
Zenan Ling, Xingyu Xie, Qiuhao Wang, Zongpeng Zhang, Zhouchen Lin · 27 May 2022

Improved Overparametrization Bounds for Global Convergence of Stochastic Gradient Descent for Shallow Neural Networks
Bartlomiej Polaczyk, J. Cyranka · ODL · 28 Jan 2022

On the Convergence of Shallow Neural Network Training with Randomly Masked Neurons
Fangshuo Liao, Anastasios Kyrillidis · 05 Dec 2021

Offline Neural Contextual Bandits: Pessimism, Optimization and Generalization
Thanh Nguyen-Tang, Sunil R. Gupta, A. Nguyen, Svetha Venkatesh · OffRL · 27 Nov 2021

On Provable Benefits of Depth in Training Graph Convolutional Networks
Weilin Cong, M. Ramezani, M. Mahdavi · 28 Oct 2021

A global convergence theory for deep ReLU implicit networks via over-parameterization
Tianxiang Gao, Hailiang Liu, Jia Liu, Hridesh Rajan, Hongyang Gao · MLT · 11 Oct 2021

On the Implicit Bias of Initialization Shape: Beyond Infinitesimal Mirror Descent
Shahar Azulay, E. Moroshko, Mor Shpigel Nacson, Blake E. Woodworth, Nathan Srebro, Amir Globerson, Daniel Soudry · AI4CE · 19 Feb 2021