On skip connections and normalisation layers in deep optimisation

10 October 2022

Papers citing "On skip connections and normalisation layers in deep optimisation"

On progressive sharpening, flat minima and generalisation | L. MacDonald, Jack Valmadre, Simon Lucey | 27 4 0 | 24 May 2023
Understanding Gradient Descent on Edge of Stability in Deep Learning | Sanjeev Arora, Zhiyuan Li, A. Panigrahi | MLT | 83 90 0 | 19 May 2022
Rapid training of deep neural networks without skip connections or normalization layers using Deep Kernel Shaping | James Martens, Andy Ballard, Guillaume Desjardins, G. Swirszcz, Valentin Dalibard, Jascha Narain Sohl-Dickstein, S. Schoenholz | 88 43 0 | 05 Oct 2021
On the Proof of Global Convergence of Gradient Descent for Deep ReLU Networks with Linear Widths | Quynh N. Nguyen | 43 48 0 | 24 Jan 2021
RepVGG: Making VGG-style ConvNets Great Again | Xiaohan Ding, Xinming Zhang, Ningning Ma, Jungong Han, Guiguang Ding, Jian Sun | 136 1,549 0 | 11 Jan 2021
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks | Lechao Xiao, Yasaman Bahri, Jascha Narain Sohl-Dickstein, S. Schoenholz, Jeffrey Pennington | 242 348 0 | 14 Jun 2018
Linear Convergence of Gradient and Proximal-Gradient Methods Under the Polyak-Łojasiewicz Condition | Hamed Karimi, J. Nutini, Mark W. Schmidt | 139 1,201 0 | 16 Aug 2016