A Mean Field Theory of Batch Normalization

21 February 2019 · arXiv: 1902.08129
Greg Yang, Jeffrey Pennington, Vinay Rao, Jascha Narain Sohl-Dickstein, S. Schoenholz
Papers citing "A Mean Field Theory of Batch Normalization"

9 of 9 citing papers shown.

On Vanishing Gradients, Over-Smoothing, and Over-Squashing in GNNs: Bridging Recurrent and Graph Learning
Alvaro Arroyo, Alessio Gravina, Benjamin Gutteridge, Federico Barbero, Claudio Gallicchio, Xiaowen Dong, Michael M. Bronstein, P. Vandergheynst
15 Feb 2025

Residual Connections and Normalization Can Provably Prevent Oversmoothing in GNNs
Michael Scholkemper, Xinyi Wu, Ali Jadbabaie, Michael T. Schaub
05 Jun 2024

Tensor Programs I: Wide Feedforward or Recurrent Neural Networks of Any Architecture are Gaussian Processes
Greg Yang
28 Oct 2019

Scaling Limits of Wide Neural Networks with Weight Sharing: Gaussian Process Behavior, Gradient Independence, and Neural Tangent Kernel Derivation
Greg Yang
13 Feb 2019

Dynamical Isometry and a Mean Field Theory of LSTMs and GRUs
D. Gilboa, B. Chang, Minmin Chen, Greg Yang, S. Schoenholz, Ed H. Chi, Jeffrey Pennington
25 Jan 2019

Understanding Batch Normalization
Johan Bjorck, Carla P. Gomes, B. Selman, Kilian Q. Weinberger
01 Jun 2018

Deep Neural Networks as Gaussian Processes
Jaehoon Lee, Yasaman Bahri, Roman Novak, S. Schoenholz, Jeffrey Pennington, Jascha Narain Sohl-Dickstein
01 Nov 2017

The Shattered Gradients Problem: If resnets are the answer, then what is the question?
David Balduzzi, Marcus Frean, Lennox Leary, J. P. Lewis, Kurt Wan-Duo Ma, Brian McWilliams
28 Feb 2017

Exponential expressivity in deep neural networks through transient chaos
Ben Poole, Subhaneil Lahiri, M. Raghu, Jascha Narain Sohl-Dickstein, Surya Ganguli
16 Jun 2016