ResearchTrend.AI
Dynamical Isometry and a Mean Field Theory of LSTMs and GRUs (arXiv:1901.08987)
25 January 2019
D. Gilboa, B. Chang, Minmin Chen, Greg Yang, S. Schoenholz, Ed H. Chi, Jeffrey Pennington

Papers citing "Dynamical Isometry and a Mean Field Theory of LSTMs and GRUs" (24 papers):
  • AntisymmetricRNN: A Dynamical System View on Recurrent Neural Networks
    B. Chang, Minmin Chen, E. Haber, Ed H. Chi (26 Feb 2019)
  • A Mean Field Theory of Batch Normalization
    Greg Yang, Jeffrey Pennington, Vinay Rao, Jascha Narain Sohl-Dickstein, S. Schoenholz (21 Feb 2019)
  • Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent
    Jaehoon Lee, Lechao Xiao, S. Schoenholz, Yasaman Bahri, Roman Novak, Jascha Narain Sohl-Dickstein, Jeffrey Pennington (18 Feb 2019)
  • Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation
    Greg Yang (13 Feb 2019)
  • A Convergence Theory for Deep Learning via Over-Parameterization
    Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song (09 Nov 2018)
  • h-detach: Modifying the LSTM Gradient Towards Better Optimization
    Devansh Arpit, Bhargav Kanuparthi, Giancarlo Kerg, Nan Rosemary Ke, Ioannis Mitliagkas, Yoshua Bengio (06 Oct 2018)
  • Gradient Descent Provably Optimizes Over-parameterized Neural Networks
    S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh (04 Oct 2018)
  • Neural Tangent Kernel: Convergence and Generalization in Neural Networks
    Arthur Jacot, Franck Gabriel, Clément Hongler (20 Jun 2018)
  • Dynamical Isometry and a Mean Field Theory of RNNs: Gating Enables Signal Propagation in Recurrent Neural Networks
    Minmin Chen, Jeffrey Pennington, S. Schoenholz (14 Jun 2018)
  • Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
    Lechao Xiao, Yasaman Bahri, Jascha Narain Sohl-Dickstein, S. Schoenholz, Jeffrey Pennington (14 Jun 2018)
  • Gaussian Process Behaviour in Wide Deep Neural Networks
    A. G. Matthews, Mark Rowland, Jiri Hron, Richard Turner, Zoubin Ghahramani (30 Apr 2018)
  • A Mean Field View of the Landscape of Two-Layers Neural Networks
    Song Mei, Andrea Montanari, Phan-Minh Nguyen (18 Apr 2018)
  • Learning Longer-term Dependencies in RNNs with Auxiliary Losses
    Trieu H. Trinh, Andrew M. Dai, Thang Luong, Quoc V. Le (01 Mar 2018)
  • The Emergence of Spectral Universality in Deep Networks
    Jeffrey Pennington, S. Schoenholz, Surya Ganguli (27 Feb 2018)
  • Mean Field Residual Networks: On the Edge of Chaos
    Greg Yang, S. Schoenholz (24 Dec 2017)
  • Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice
    Jeffrey Pennington, S. Schoenholz, Surya Ganguli (13 Nov 2017)
  • Kronecker Recurrent Units
    C. Jose, Moustapha Cissé, François Fleuret (29 May 2017)
  • On orthogonality and learning recurrent networks with long term dependencies
    Eugene Vorontsov, C. Trabelsi, Samuel Kadoury, C. Pal (31 Jan 2017)
  • Deep Information Propagation
    S. Schoenholz, Justin Gilmer, Surya Ganguli, Jascha Narain Sohl-Dickstein (04 Nov 2016)
  • Full-Capacity Unitary Recurrent Neural Networks
    Scott Wisdom, Thomas Powers, J. Hershey, Jonathan Le Roux, L. Atlas (31 Oct 2016)
  • Exponential expressivity in deep neural networks through transient chaos
    Ben Poole, Subhaneil Lahiri, M. Raghu, Jascha Narain Sohl-Dickstein, Surya Ganguli (16 Jun 2016)
  • Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation
    Kyunghyun Cho, B. V. Merrienboer, Çağlar Gülçehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio (03 Jun 2014)
  • Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
    Andrew M. Saxe, James L. McClelland, Surya Ganguli (20 Dec 2013)
  • On the difficulty of training Recurrent Neural Networks
    Razvan Pascanu, Tomas Mikolov, Yoshua Bengio (21 Nov 2012)