Deep equilibrium networks are sensitive to initialization statistics

19 July 2022

Papers citing "Deep equilibrium networks are sensitive to initialization statistics"

Title
A global convergence theory for deep ReLU implicit networks via over-parameterization Tianxiang Gao Hailiang Liu Jia Liu Hridesh Rajan Hongyang Gao MLT 66 16 0 11 Oct 2021
Rapid training of deep neural networks without skip connections or normalization layers using Deep Kernel Shaping James Martens Andy Ballard Guillaume Desjardins G. Swirszcz Valentin Dalibard Jascha Narain Sohl-Dickstein S. Schoenholz 124 45 0 05 Oct 2021
Stabilizing Equilibrium Models by Jacobian Regularization Shaojie Bai V. Koltun J. Zico Kolter 69 58 0 28 Jun 2021
On the validity of kernel approximations for orthogonally-initialized neural networks James Martens 31 3 0 13 Apr 2021
Deep Equilibrium Architectures for Inverse Problems in Imaging Davis Gilton Greg Ongie Rebecca Willett 78 181 0 16 Feb 2021
On the Theory of Implicit Deep Learning: Global Convergence with Implicit Layers Kenji Kawaguchi PINN 47 42 0 15 Feb 2021
Lipschitz Bounded Equilibrium Networks Max Revay Ruigang Wang I. Manchester 44 76 0 05 Oct 2020
Monotone operator equilibrium networks Ezra Winston J. Zico Kolter 62 130 0 15 Jun 2020
On the Neural Tangent Kernel of Deep Networks with Orthogonal Initialization Wei Huang Weitao Du R. Xu 50 37 0 13 Apr 2020
Dissecting Neural ODEs Stefano Massaroli Michael Poli Jinkyoo Park Atsushi Yamashita Hajime Asama 96 204 0 19 Feb 2020
Provable Benefit of Orthogonal Initialization in Optimizing Deep Linear Networks Wei Hu Lechao Xiao Jeffrey Pennington 64 113 0 16 Jan 2020
Tensor Programs I: Wide Feedforward or Recurrent Neural Networks of Any Architecture are Gaussian Processes Greg Yang 121 201 0 28 Oct 2019
Finite Depth and Width Corrections to the Neural Tangent Kernel Boris Hanin Mihai Nica MDE 71 151 0 13 Sep 2019
Deep Equilibrium Models Shaojie Bai J. Zico Kolter V. Koltun 94 671 0 03 Sep 2019
Implicit Deep Learning L. Ghaoui Fangda Gu Bertrand Travacca Armin Askari Alicia Y. Tsai AI4CE 64 180 0 17 Aug 2019
AntisymmetricRNN: A Dynamical System View on Recurrent Neural Networks B. Chang Minmin Chen E. Haber Ed H. Chi PINN GNN 110 207 0 26 Feb 2019
Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent Jaehoon Lee Lechao Xiao S. Schoenholz Yasaman Bahri Roman Novak Jascha Narain Sohl-Dickstein Jeffrey Pennington 211 1,106 0 18 Feb 2019
SentencePiece: A simple and language independent subword tokenizer and detokenizer for Neural Text Processing Taku Kudo John Richardson 201 3,526 0 19 Aug 2018
Character-Level Language Modeling with Deeper Self-Attention Rami Al-Rfou Dokook Choe Noah Constant Mandy Guo Llion Jones 141 392 0 09 Aug 2018
Neural Ordinary Differential Equations T. Chen Yulia Rubanova J. Bettencourt David Duvenaud AI4CE 417 5,156 0 19 Jun 2018
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks Lechao Xiao Yasaman Bahri Jascha Narain Sohl-Dickstein S. Schoenholz Jeffrey Pennington 301 354 0 14 Jun 2018
Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice Jeffrey Pennington S. Schoenholz Surya Ganguli ODL 43 253 0 13 Nov 2017
Pointer Sentinel Mixture Models Stephen Merity Caiming Xiong James Bradbury R. Socher RALM 328 2,895 0 26 Sep 2016
Exponential expressivity in deep neural networks through transient chaos Ben Poole Subhaneil Lahiri M. Raghu Jascha Narain Sohl-Dickstein Surya Ganguli 90 592 0 16 Jun 2016
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification Kaiming He Xiangyu Zhang Shaoqing Ren Jian Sun VLM 326 18,647 0 06 Feb 2015
Exact solutions to the nonlinear dynamics of learning in deep linear neural networks Andrew M. Saxe James L. McClelland Surya Ganguli ODL 181 1,849 0 20 Dec 2013