ResearchTrend.AI
Understanding and Improving Layer Normalization
arXiv: 1911.07013

16 November 2019
Jingjing Xu
Xu Sun
Zhiyuan Zhang
Guangxiang Zhao
Junyang Lin
    FAtt

Papers citing "Understanding and Improving Layer Normalization"

29 / 129 papers shown
Dynamic Token Normalization Improves Vision Transformers
Wenqi Shao
Yixiao Ge
Zhaoyang Zhang
Xuyuan Xu
Xiaogang Wang
Ying Shan
Ping Luo
ViT
129
11
0
05 Dec 2021
FaceTuneGAN: Face Autoencoder for Convolutional Expression Transfer Using Neural Generative Adversarial Networks
Nicolas Olivier
Kelian Baert
F. Danieau
Franck Multon
Quentin Avril
CVBM
3DH
41
18
0
01 Dec 2021
Critical Initialization of Wide and Deep Neural Networks through Partial Jacobians: General Theory and Applications
Darshil Doshi
Tianyu He
Andrey Gromov
30
8
0
23 Nov 2021
A Survey on Green Deep Learning
Jingjing Xu
Wangchunshu Zhou
Zhiyi Fu
Hao Zhou
Lei Li
VLM
81
83
0
08 Nov 2021
Dropout Q-Functions for Doubly Efficient Reinforcement Learning
Takuya Hiraoka
Takahisa Imagawa
Taisei Hashimoto
Takashi Onishi
Yoshimasa Tsuruoka
11
105
0
05 Oct 2021
Batch Normalization Preconditioning for Neural Network Training
Susanna Lange
Kyle E. Helfrich
Qiang Ye
27
9
0
02 Aug 2021
Transformer-based Map Matching Model with Limited Ground-Truth Data using Transfer-Learning Approach
Zhixiong Jin
Jiwon Kim
H. Yeo
Seongjin Choi
35
27
0
01 Aug 2021
Beyond BatchNorm: Towards a Unified Understanding of Normalization in Deep Learning
Ekdeep Singh Lubana
Robert P. Dick
Hidenori Tanaka
35
35
0
10 Jun 2021
A Survey of Transformers
Tianyang Lin
Yuxin Wang
Xiangyang Liu
Xipeng Qiu
ViT
53
1,088
0
08 Jun 2021
Adversarially Adaptive Normalization for Single Domain Generalization
Xinjie Fan
Qifei Wang
Junjie Ke
Feng Yang
Boqing Gong
Mingyuan Zhou
27
129
0
01 Jun 2021
Rethinking Skip Connection with Layer Normalization in Transformers and ResNets
Fenglin Liu
Xuancheng Ren
Zhiyuan Zhang
Xu Sun
Yuexian Zou
AI4CE
29
67
0
15 May 2021
BERT Busters: Outlier Dimensions that Disrupt Transformers
Olga Kovaleva
Saurabh Kulshreshtha
Anna Rogers
Anna Rumshisky
19
85
0
14 May 2021
When Attention Meets Fast Recurrence: Training Language Models with Reduced Compute
Tao Lei
RALM
VLM
59
47
0
24 Feb 2021
A Survey on Visual Transformer
Kai Han
Yunhe Wang
Hanting Chen
Xinghao Chen
Jianyuan Guo
...
Chunjing Xu
Yixing Xu
Zhaohui Yang
Yiman Zhang
Dacheng Tao
ViT
23
2,132
0
23 Dec 2020
Batch Group Normalization
Xiao-Yun Zhou
Jiacheng Sun
Nanyang Ye
Xu Lan
Qijun Luo
Bolin Lai
P. Esperança
Guang-Zhong Yang
Zhenguo Li
24
17
0
04 Dec 2020
Generative Layout Modeling using Constraint Graphs
W. Para
Paul Guerrero
Tom Kelly
Leonidas J. Guibas
Peter Wonka
31
68
0
26 Nov 2020
DTGAN: Dual Attention Generative Adversarial Networks for Text-to-Image Generation
Zhenxing Zhang
Lambert Schomaker
GAN
31
34
0
05 Nov 2020
Unsupervised Bitext Mining and Translation via Self-trained Contextual Embeddings
Phillip Keung
Julian Salazar
Y. Lu
Noah A. Smith
SSL
27
25
0
15 Oct 2020
Query-Key Normalization for Transformers
Alex Henry
Prudhvi Raj Dachapally
S. Pawar
Yuxuan Chen
17
75
0
08 Oct 2020
STIL -- Simultaneous Slot Filling, Translation, Intent Classification, and Language Identification: Initial Results using mBART on MultiATIS++
Jack G. M. FitzGerald
19
13
0
02 Oct 2020
Group Whitening: Balancing Learning Efficiency and Representational Capacity
Lei Huang
Yi Zhou
Li Liu
Fan Zhu
Ling Shao
28
21
0
28 Sep 2020
Normalization Techniques in Training DNNs: Methodology, Analysis and Application
Lei Huang
Jie Qin
Yi Zhou
Fan Zhu
Li Liu
Ling Shao
AI4CE
12
255
0
27 Sep 2020
Review: Deep Learning in Electron Microscopy
Jeffrey M. Ede
34
79
0
17 Sep 2020
Very Deep Transformers for Neural Machine Translation
Xiaodong Liu
Kevin Duh
Liyuan Liu
Jianfeng Gao
8
102
0
18 Aug 2020
Matrix Shuffle-Exchange Networks for Hard 2D Tasks
Emīls Ozoliņš
Kārlis Freivalds
A. Sostaks
13
0
0
29 Jun 2020
Correct Normalization Matters: Understanding the Effect of Normalization On Deep Neural Network Models For Click-Through Rate Prediction
Zhiqiang Wang
Qingyun She
Pengtao Zhang
Junlin Zhang
21
7
0
23 Jun 2020
Residual Shuffle-Exchange Networks for Fast Processing of Long Sequences
Andis Draguns
Emīls Ozoliņš
A. Sostaks
Matiss Apinis
Kārlis Freivalds
11
8
0
06 Apr 2020
PowerNorm: Rethinking Batch Normalization in Transformers
Sheng Shen
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
BDL
24
16
0
17 Mar 2020
Implicit Regularization and Convergence for Weight Normalization
Xiaoxia Wu
Yan Sun
Tongzheng Ren
Shanshan Wu
Zhiyuan Li
Suriya Gunasekar
Rachel A. Ward
Qiang Liu
20
21
0
18 Nov 2019