1906.02926
Cited By
The Normalization Method for Alleviating Pathological Sharpness in Wide Neural Networks
7 June 2019
Ryo Karakida
S. Akaho
S. Amari
Papers citing
"The Normalization Method for Alleviating Pathological Sharpness in Wide Neural Networks"
29 / 29 papers shown
Parallel Layer Normalization for Universal Approximation
Yunhao Ni
Yuhe Liu
Wenxin Sun
Yitong Tang
Yuxin Guo
Peilin Feng
Wenjun Wu
Lei Huang
20
0
0
19 May 2025
Non-identifiability distinguishes Neural Networks among Parametric Models
Sourav Chatterjee
Timothy Sudijono
40
0
0
25 Apr 2025
Transformers without Normalization
Jiachen Zhu
Xinlei Chen
Kaiming He
Yann LeCun
Zhuang Liu
ViT
OffRL
84
8
0
13 Mar 2025
Towards the Spectral bias Alleviation by Normalizations in Coordinate Networks
Zhicheng Cai
Hao Zhu
Qiu Shen
Xinran Wang
Xun Cao
78
0
0
25 Jul 2024
On the Nonlinearity of Layer Normalization
Yunhao Ni
Yuxin Guo
Junlong Jia
Lei Huang
52
5
0
03 Jun 2024
CHAIN: Enhancing Generalization in Data-Efficient GANs via lipsCHitz continuity constrAIned Normalization
Yao Ni
Piotr Koniusz
AI4CE
GAN
45
1
0
31 Mar 2024
Neuro-Visualizer: An Auto-encoder-based Loss Landscape Visualization Method
Mohannad Elhamod
Anuj Karpatne
47
1
0
26 Sep 2023
Component-Wise Natural Gradient Descent -- An Efficient Neural Network Optimization
Tran van Sang
Mhd Irvan
R. Yamaguchi
Toshiyuki Nakata
26
1
0
11 Oct 2022
Analyzing Sharpness along GD Trajectory: Progressive Sharpening and Edge of Stability
Z. Li
Zixuan Wang
Jian Li
31
44
0
26 Jul 2022
Understanding the Generalization Benefit of Normalization Layers: Sharpness Reduction
Kaifeng Lyu
Zhiyuan Li
Sanjeev Arora
FAtt
54
71
0
14 Jun 2022
Beyond accuracy: generalization properties of bio-plausible temporal credit assignment rules
Yuhan Helena Liu
Arna Ghosh
Blake A. Richards
E. Shea-Brown
Guillaume Lajoie
53
9
0
02 Jun 2022
TorchNTK: A Library for Calculation of Neural Tangent Kernels of PyTorch Models
A. Engel
Zhichao Wang
Anand D. Sarwate
Sutanay Choudhury
Tony Chiang
47
3
0
24 May 2022
Beyond BatchNorm: Towards a Unified Understanding of Normalization in Deep Learning
Ekdeep Singh Lubana
Robert P. Dick
Hidenori Tanaka
38
35
0
10 Jun 2021
Batch Normalization Orthogonalizes Representations in Deep Random Networks
Hadi Daneshmand
Amir Joudaki
Francis R. Bach
OOD
17
37
0
07 Jun 2021
Asymptotic Freeness of Layerwise Jacobians Caused by Invariance of Multilayer Perceptron: The Haar Orthogonal Case
B. Collins
Tomohiro Hayase
33
7
0
24 Mar 2021
ASAM: Adaptive Sharpness-Aware Minimization for Scale-Invariant Learning of Deep Neural Networks
Jungmin Kwon
Jeongseop Kim
Hyunseong Park
I. Choi
53
287
0
23 Feb 2021
Dissecting Hessian: Understanding Common Structure of Hessian in Neural Networks
Yikai Wu
Xingyu Zhu
Chenwei Wu
Annie Wang
Rong Ge
35
43
0
08 Oct 2020
Understanding Approximate Fisher Information for Fast Convergence of Natural Gradient Descent in Wide Neural Networks
Ryo Karakida
Kazuki Osawa
27
26
0
02 Oct 2020
Group Whitening: Balancing Learning Efficiency and Representational Capacity
Lei Huang
Yi Zhou
Li Liu
Fan Zhu
Ling Shao
38
21
0
28 Sep 2020
Normalization Techniques in Training DNNs: Methodology, Analysis and Application
Lei Huang
Jie Qin
Yi Zhou
Fan Zhu
Li Liu
Ling Shao
AI4CE
32
258
0
27 Sep 2020
Spherical Perspective on Learning with Normalization Layers
Simon Roburin
Yann de Mont-Marin
Andrei Bursuc
Renaud Marlet
P. Pérez
Mathieu Aubry
16
6
0
23 Jun 2020
When Does Preconditioning Help or Hurt Generalization?
S. Amari
Jimmy Ba
Roger C. Grosse
Xuechen Li
Atsushi Nitanda
Taiji Suzuki
Denny Wu
Ji Xu
41
32
0
18 Jun 2020
The Spectrum of Fisher Information of Deep Networks Achieving Dynamical Isometry
Tomohiro Hayase
Ryo Karakida
34
7
0
14 Jun 2020
Batch Normalization Provably Avoids Rank Collapse for Randomly Initialised Deep Networks
Hadi Daneshmand
Jonas Köhler
Francis R. Bach
Thomas Hofmann
Aurelien Lucchi
OOD
ODL
10
4
0
03 Mar 2020
Any Target Function Exists in a Neighborhood of Any Sufficiently Wide Random Network: A Geometrical Perspective
S. Amari
32
12
0
20 Jan 2020
Pathological spectra of the Fisher information metric and its variants in deep neural networks
Ryo Karakida
S. Akaho
S. Amari
33
28
0
14 Oct 2019
Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks
Lechao Xiao
Yasaman Bahri
Jascha Narain Sohl-Dickstein
S. Schoenholz
Jeffrey Pennington
250
350
0
14 Jun 2018
Universal Statistics of Fisher Information in Deep Neural Networks: Mean Field Approach
Ryo Karakida
S. Akaho
S. Amari
FedML
54
141
0
04 Jun 2018
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
N. Keskar
Dheevatsa Mudigere
J. Nocedal
M. Smelyanskiy
P. T. P. Tang
ODL
318
2,904
0
15 Sep 2016