1806.05393
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
14 June 2018
Lechao Xiao
Yasaman Bahri
Jascha Narain Sohl-Dickstein
S. Schoenholz
Jeffrey Pennington
Papers citing
"Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks"
50 / 70 papers shown
Title
Don't be lazy: CompleteP enables compute-efficient deep transformers
Nolan Dey
Bin Claire Zhang
Lorenzo Noci
Mufan Bill Li
Blake Bordelon
Shane Bergsma
C. Pehlevan
Boris Hanin
Joel Hestness
39
0
0
02 May 2025
AlphaGrad: Non-Linear Gradient Normalization Optimizer
Soham Sane
ODL
56
0
0
22 Apr 2025
Fast Training of Sinusoidal Neural Fields via Scaling Initialization
Taesun Yeom
Sangyoon Lee
Jaeho Lee
55
2
0
07 Oct 2024
Parseval Convolution Operators and Neural Networks
Michael Unser
Stanislas Ducotterd
25
3
0
19 Aug 2024
Equivariant Neural Tangent Kernels
Philipp Misof
Pan Kessel
Jan E. Gerken
61
0
0
10 Jun 2024
Understanding and Minimising Outlier Features in Neural Network Training
Bobby He
Lorenzo Noci
Daniele Paliotta
Imanol Schlag
Thomas Hofmann
36
3
0
29 May 2024
On the Neural Tangent Kernel of Equilibrium Models
Zhili Feng
J. Zico Kolter
18
6
0
21 Oct 2023
Dynamical Isometry based Rigorous Fair Neural Architecture Search
Jianxiang Luo
Junyi Hu
Tianji Pang
Weihao Huang
Chuan-Hsi Liu
21
0
0
05 Jul 2023
Spike-driven Transformer
Man Yao
Jiakui Hu
Zhaokun Zhou
Liuliang Yuan
Yonghong Tian
Boxing Xu
Guoqi Li
34
114
0
04 Jul 2023
Unraveling Projection Heads in Contrastive Learning: Insights from Expansion and Shrinkage
Yu Gui
Cong Ma
Yiqiao Zhong
22
6
0
06 Jun 2023
Robust low-rank training via approximate orthonormal constraints
Dayana Savostianova
Emanuele Zangrando
Gianluca Ceruti
Francesco Tudisco
24
9
0
02 Jun 2023
TIPS: Topologically Important Path Sampling for Anytime Neural Networks
Guihong Li
Kartikeya Bhardwaj
Yuedong Yang
R. Marculescu
AAML
36
0
0
13 May 2023
Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks
Eshaan Nichani
Alexandru Damian
Jason D. Lee
MLT
38
13
0
11 May 2023
Criticality versus uniformity in deep neural networks
A. Bukva
Jurriaan de Gier
Kevin T. Grosvenor
R. Jefferson
K. Schalm
Eliot Schwander
28
3
0
10 Apr 2023
On the Initialisation of Wide Low-Rank Feedforward Neural Networks
Thiziri Nait Saada
Jared Tanner
13
1
0
31 Jan 2023
Why is the State of Neural Network Pruning so Confusing? On the Fairness, Comparison Setup, and Trainability in Network Pruning
Huan Wang
Can Qin
Yue Bai
Yun Fu
32
20
0
12 Jan 2023
Orthogonal SVD Covariance Conditioning and Latent Disentanglement
Yue Song
N. Sebe
Wei Wang
26
6
0
11 Dec 2022
Statistical Physics of Deep Neural Networks: Initialization toward Optimal Channels
Kangyu Weng
Aohua Cheng
Ziyang Zhang
Pei Sun
Yang Tian
48
2
0
04 Dec 2022
Improved techniques for deterministic l2 robustness
Sahil Singla
S. Feizi
AAML
23
9
0
15 Nov 2022
Proximal Mean Field Learning in Shallow Neural Networks
Alexis M. H. Teter
Iman Nodozi
A. Halder
FedML
43
1
0
25 Oct 2022
Component-Wise Natural Gradient Descent – An Efficient Neural Network Optimization
Tran van Sang
Mhd Irvan
R. Yamaguchi
Toshiyuki Nakata
13
1
0
11 Oct 2022
On skip connections and normalisation layers in deep optimisation
L. MacDonald
Jack Valmadre
Hemanth Saratchandran
Simon Lucey
ODL
19
1
0
10 Oct 2022
Dynamical Isometry for Residual Networks
Advait Gadhikar
R. Burkholz
ODL
AI4CE
40
2
0
05 Oct 2022
Dynamical systems' based neural networks
E. Celledoni
Davide Murari
B. Owren
Carola-Bibiane Schönlieb
Ferdia Sherry
OOD
40
10
0
05 Oct 2022
Neural Networks Reduction via Lumping
Dalila Ressi
Riccardo Romanello
S. Rossi
Carla Piazza
30
4
0
15 Sep 2022
Improving Covariance Conditioning of the SVD Meta-layer by Orthogonality
Yue Song
N. Sebe
Wei Wang
16
8
0
05 Jul 2022
Fast Finite Width Neural Tangent Kernel
Roman Novak
Jascha Narain Sohl-Dickstein
S. Schoenholz
AAML
20
53
0
17 Jun 2022
Feedback Gradient Descent: Efficient and Stable Optimization with Orthogonality for DNNs
Fanchen Bu
D. Chang
28
6
0
12 May 2022
Deep Learning without Shortcuts: Shaping the Kernel with Tailored Rectifiers
Guodong Zhang
Aleksandar Botev
James Martens
OffRL
21
26
0
15 Mar 2022
projUNN: efficient method for training deep networks with unitary matrices
B. Kiani
Randall Balestriero
Yann LeCun
S. Lloyd
41
32
0
10 Mar 2022
A Johnson–Lindenstrauss Framework for Randomly Initialized CNNs
Ido Nachum
Jan Hązła
Michael C. Gastpar
Anatoly Khina
33
0
0
03 Nov 2021
Ridgeless Interpolation with Shallow ReLU Networks in 1D is Nearest Neighbor Curvature Extrapolation and Provably Generalizes on Lipschitz Functions
Boris Hanin
MLT
38
9
0
27 Sep 2021
Orthogonal Graph Neural Networks
Kai Guo
Kaixiong Zhou
Xia Hu
Yu Li
Yi Chang
Xin Wang
43
34
0
23 Sep 2021
AutoInit: Analytic Signal-Preserving Weight Initialization for Neural Networks
G. Bingham
Risto Miikkulainen
ODL
24
4
0
18 Sep 2021
Existence, Stability and Scalability of Orthogonal Convolutional Neural Networks
E. M. Achour
François Malgouyres
Franck Mamalet
16
20
0
12 Aug 2021
Towards quantifying information flows: relative entropy in deep neural networks and the renormalization group
J. Erdmenger
Kevin T. Grosvenor
R. Jefferson
54
17
0
14 Jul 2021
Marginalizable Density Models
D. Gilboa
Ari Pakman
Thibault Vatter
BDL
32
5
0
08 Jun 2021
Going deeper with Image Transformers
Hugo Touvron
Matthieu Cord
Alexandre Sablayrolles
Gabriel Synnaeve
Hervé Jégou
ViT
27
986
0
31 Mar 2021
Asymptotic Freeness of Layerwise Jacobians Caused by Invariance of Multilayer Perceptron: The Haar Orthogonal Case
B. Collins
Tomohiro Hayase
22
7
0
24 Mar 2021
RepVGG: Making VGG-style ConvNets Great Again
Xiaohan Ding
Xinming Zhang
Ningning Ma
Jungong Han
Guiguang Ding
Jian Sun
136
1,548
0
11 Jan 2021
Advances in Electron Microscopy with Deep Learning
Jeffrey M. Ede
32
2
0
04 Jan 2021
StackRec: Efficient Training of Very Deep Sequential Recommender Models by Iterative Stacking
Jiachun Wang
Fajie Yuan
Jian Chen
Qingyao Wu
Min Yang
Yang Sun
Guoxiao Zhang
BDL
40
26
0
14 Dec 2020
BYOL works even without batch statistics
Pierre Harvey Richemond
Jean-Bastien Grill
Florent Altché
Corentin Tallec
Florian Strub
...
Samuel L. Smith
Soham De
Razvan Pascanu
Bilal Piot
Michal Valko
SSL
250
114
0
20 Oct 2020
Review: Deep Learning in Electron Microscopy
Jeffrey M. Ede
31
79
0
17 Sep 2020
Whitening and second order optimization both make information in the dataset unusable during training, and can reduce or prevent generalization
Neha S. Wadia
Daniel Duckworth
S. Schoenholz
Ethan Dyer
Jascha Narain Sohl-Dickstein
27
13
0
17 Aug 2020
Beyond Signal Propagation: Is Feature Diversity Necessary in Deep Neural Network Initialization?
Yaniv Blumenfeld
D. Gilboa
Daniel Soudry
ODL
22
13
0
02 Jul 2020
Deep Isometric Learning for Visual Recognition
Haozhi Qi
Chong You
Xinyu Wang
Yi Ma
Jitendra Malik
VLM
30
53
0
30 Jun 2020
Tensor Programs II: Neural Tangent Kernel for Any Architecture
Greg Yang
48
134
0
25 Jun 2020
The Spectrum of Fisher Information of Deep Networks Achieving Dynamical Isometry
Tomohiro Hayase
Ryo Karakida
27
7
0
14 Jun 2020
Batch Normalization Biases Residual Blocks Towards the Identity Function in Deep Networks
Soham De
Samuel L. Smith
ODL
14
20
0
24 Feb 2020