ResearchTrend.AI
Dynamical Isometry and a Mean Field Theory of LSTMs and GRUs (arXiv:1901.08987)
25 January 2019
D. Gilboa, B. Chang, Minmin Chen, Greg Yang, S. Schoenholz, Ed H. Chi, Jeffrey Pennington

Papers citing "Dynamical Isometry and a Mean Field Theory of LSTMs and GRUs" (24 papers):
  • AntisymmetricRNN: A Dynamical System View on Recurrent Neural Networks
    B. Chang, Minmin Chen, E. Haber, Ed H. Chi (26 Feb 2019)
  • A Mean Field Theory of Batch Normalization
    Greg Yang, Jeffrey Pennington, Vinay Rao, Jascha Narain Sohl-Dickstein, S. Schoenholz (21 Feb 2019)
  • Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent
    Jaehoon Lee, Lechao Xiao, S. Schoenholz, Yasaman Bahri, Roman Novak, Jascha Narain Sohl-Dickstein, Jeffrey Pennington (18 Feb 2019)
  • Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation
    Greg Yang (13 Feb 2019)
  • A Convergence Theory for Deep Learning via Over-Parameterization
    Zeyuan Allen-Zhu, Yuanzhi Li, Zhao Song (09 Nov 2018)
  • h-detach: Modifying the LSTM Gradient Towards Better Optimization
    Devansh Arpit, Bhargav Kanuparthi, Giancarlo Kerg, Nan Rosemary Ke, Ioannis Mitliagkas, Yoshua Bengio (06 Oct 2018)
  • Gradient Descent Provably Optimizes Over-parameterized Neural Networks
    S. Du, Xiyu Zhai, Barnabás Póczós, Aarti Singh (04 Oct 2018)
  • Neural Tangent Kernel: Convergence and Generalization in Neural Networks
    Arthur Jacot, Franck Gabriel, Clément Hongler (20 Jun 2018)
  • Dynamical Isometry and a Mean Field Theory of RNNs: Gating Enables Signal Propagation in Recurrent Neural Networks
    Minmin Chen, Jeffrey Pennington, S. Schoenholz (14 Jun 2018)
  • Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
    Lechao Xiao, Yasaman Bahri, Jascha Narain Sohl-Dickstein, S. Schoenholz, Jeffrey Pennington (14 Jun 2018)
  • Gaussian Process Behaviour in Wide Deep Neural Networks
    A. G. Matthews, Mark Rowland, Jiri Hron, Richard Turner, Zoubin Ghahramani (30 Apr 2018)
  • A Mean Field View of the Landscape of Two-Layers Neural Networks
    Song Mei, Andrea Montanari, Phan-Minh Nguyen (18 Apr 2018)
  • Learning Longer-term Dependencies in RNNs with Auxiliary Losses
    Trieu H. Trinh, Andrew M. Dai, Thang Luong, Quoc V. Le (01 Mar 2018)
  • The Emergence of Spectral Universality in Deep Networks
    Jeffrey Pennington, S. Schoenholz, Surya Ganguli (27 Feb 2018)
  • Mean Field Residual Networks: On the Edge of Chaos
    Greg Yang, S. Schoenholz (24 Dec 2017)
  • Resurrecting the sigmoid in deep learning through dynamical isometry: theory and practice
    Jeffrey Pennington, S. Schoenholz, Surya Ganguli (13 Nov 2017)
  • Kronecker Recurrent Units
    C. Jose, Moustapha Cissé, François Fleuret (29 May 2017)
  • On orthogonality and learning recurrent networks with long term dependencies
    Eugene Vorontsov, C. Trabelsi, Samuel Kadoury, C. Pal (31 Jan 2017)
  • Deep Information Propagation
    S. Schoenholz, Justin Gilmer, Surya Ganguli, Jascha Narain Sohl-Dickstein (04 Nov 2016)
  • Full-Capacity Unitary Recurrent Neural Networks
    Scott Wisdom, Thomas Powers, J. Hershey, Jonathan Le Roux, L. Atlas (31 Oct 2016)
  • Exponential expressivity in deep neural networks through transient chaos
    Ben Poole, Subhaneil Lahiri, M. Raghu, Jascha Narain Sohl-Dickstein, Surya Ganguli (16 Jun 2016)
  • Learning Phrase Representations using RNN Encoder-Decoder for Statistical Machine Translation
    Kyunghyun Cho, B. V. Merrienboer, Çağlar Gülçehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio (03 Jun 2014)
  • Exact solutions to the nonlinear dynamics of learning in deep linear neural networks
    Andrew M. Saxe, James L. McClelland, Surya Ganguli (20 Dec 2013)
  • On the difficulty of training Recurrent Neural Networks
    Razvan Pascanu, Tomas Mikolov, Yoshua Bengio (21 Nov 2012)