Towards Training Without Depth Limits: Batch Normalization Without Gradient Explosion
arXiv: 2310.02012
3 October 2023
Alexandru Meterez, Amir Joudaki, Francesco Orabona, Alexander Immer, Gunnar Rätsch, Hadi Daneshmand

Papers citing "Towards Training Without Depth Limits: Batch Normalization Without Gradient Explosion" (5 papers)

Understanding and Minimising Outlier Features in Neural Network Training
Bobby He, Lorenzo Noci, Daniele Paliotta, Imanol Schlag, Thomas Hofmann
29 May 2024

Rapid training of deep neural networks without skip connections or normalization layers using Deep Kernel Shaping
James Martens, Andy Ballard, Guillaume Desjardins, G. Swirszcz, Valentin Dalibard, Jascha Narain Sohl-Dickstein, S. Schoenholz
5 October 2021

Larger-Scale Transformers for Multilingual Masked Language Modeling
Naman Goyal, Jingfei Du, Myle Ott, Giridhar Anantharaman, Alexis Conneau
2 May 2021

Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
Lechao Xiao, Yasaman Bahri, Jascha Narain Sohl-Dickstein, S. Schoenholz, Jeffrey Pennington
14 June 2018

Global optimality conditions for deep neural networks
Chulhee Yun, S. Sra, Ali Jadbabaie
8 July 2017