1911.07013
Cited By
Understanding and Improving Layer Normalization
16 November 2019
Jingjing Xu
Xu Sun
Zhiyuan Zhang
Guangxiang Zhao
Junyang Lin
FAtt
Papers citing
"Understanding and Improving Layer Normalization"
50 / 129 papers shown
Title
CAGE: Controllable Articulation GEneration
Jiayi Liu
Hou In Ivan Tam
Ali Mahdavi-Amiri
Manolis Savva
41
18
0
15 Dec 2023
Improving Normalization with the James-Stein Estimator
Seyedalireza Khoshsirat
Chandra Kambhamettu
26
5
0
01 Dec 2023
Adapter is All You Need for Tuning Visual Tasks
Dongshuo Yin
Leiyi Hu
Bin Li
Youqun Zhang
18
15
0
25 Nov 2023
LATIS: Lambda Abstraction-based Thermal Image Super-resolution
Gargi Panda
Soumitra Kundu
Saumik Bhattacharya
Aurobinda Routray
38
0
0
18 Nov 2023
Sequence Length Independent Norm-Based Generalization Bounds for Transformers
Jacob Trauger
Ambuj Tewari
39
11
0
19 Oct 2023
Transformers are efficient hierarchical chemical graph learners
Zihan Pengmei
Zimu Li
Chih-chan Tien
Risi Kondor
Aaron R Dinner
GNN
23
1
0
02 Oct 2023
On Separate Normalization in Self-supervised Transformers
Xiaohui Chen
Yinkai Wang
Yuanqi Du
S. Hassoun
Liping Liu
ViT
24
1
0
22 Sep 2023
Mutual-Guided Dynamic Network for Image Fusion
Yuansheng Guan
Ruikang Xu
Mingde Yao
Lizhi Wang
Zhiwei Xiong
37
17
0
24 Aug 2023
ShadowNet for Data-Centric Quantum System Learning
Yuxuan Du
Yibo Yang
Tongliang Liu
Zhouchen Lin
Guohao Li
Dacheng Tao
42
7
0
22 Aug 2023
Co-Evolution of Pose and Mesh for 3D Human Body Estimation from Video
Yingxuan You
Hong Liu
Ti Wang
Wenhao Li
Runwei Ding
Xia Li
3DH
21
13
0
20 Aug 2023
HumanLiff: Layer-wise 3D Human Generation with Diffusion Model
Shou-Yong Hu
Fangzhou Hong
Tao Hu
Liang Pan
Haiyi Mei
Weiye Xiao
Lei Yang
Ziwei Liu
30
19
0
18 Aug 2023
SimMatchV2: Semi-Supervised Learning with Graph Consistency
Mingkai Zheng
Shan You
Lang Huang
Chen Luo
Fei Wang
Chao Qian
Chang Xu
SSL
29
7
0
13 Aug 2023
SODFormer: Streaming Object Detection with Transformer Using Events and Frames
Dianze Li
Jianing Li
Yonghong Tian
ViT
27
27
0
08 Aug 2023
SC VALL-E: Style-Controllable Zero-Shot Text to Speech Synthesizer
Daegyeom Kim
Seong-soo Hong
Yong-Hoon Choi
25
2
0
20 Jul 2023
A Bayesian approach to quantifying uncertainties and improving generalizability in traffic prediction models
Agnimitra Sengupta
Sudeepta Mondal
A. Das
S. I. Guler
BDL
UQCV
32
11
0
12 Jul 2023
Predicting small molecules solubilities on endpoint devices using deep ensemble neural networks
Mayk Caldas Ramos
Andrew D. White
33
0
0
11 Jul 2023
Scalable Neural Contextual Bandit for Recommender Systems
Zheqing Zhu
Benjamin Van Roy
OffRL
29
9
0
26 Jun 2023
Normalization Layers Are All That Sharpness-Aware Minimization Needs
Maximilian Mueller
Tiffany J. Vlaar
David Rolnick
Matthias Hein
27
18
0
07 Jun 2023
SING: A Plug-and-Play DNN Learning Technique
Adrien Courtois
Damien Scieur
Jean-Michel Morel
Pablo Arias
Thomas Eboli
36
0
0
25 May 2023
Pre-RMSNorm and Pre-CRMSNorm Transformers: Equivalent and Efficient Pre-LN Transformers
Zixuan Jiang
Jiaqi Gu
Hanqing Zhu
David Z. Pan
AI4CE
33
16
0
24 May 2023
Exploring the Impact of Layer Normalization for Zero-shot Neural Machine Translation
Zhuoyuan Mao
Raj Dabre
Qianying Liu
Haiyue Song
Chenhui Chu
Sadao Kurohashi
19
7
0
16 May 2023
On the Expressivity Role of LayerNorm in Transformers' Attention
Shaked Brody
Shiyu Jin
Xinghao Zhu
MoE
69
30
0
04 May 2023
An Empirical Analysis of the Shift and Scale Parameters in BatchNorm
Y. Peerthum
Mark Stamp
24
5
0
22 Mar 2023
Learning to Estimate Single-View Volumetric Flow Motions without 3D Supervision
Erik Franz
B. Solenthaler
N. Thürey
3DPC
MDE
28
2
0
28 Feb 2023
Stabilising and accelerating light gated recurrent units for automatic speech recognition
Adel Moumen
Titouan Parcollet
28
3
0
16 Feb 2023
Dual PatchNorm
Manoj Kumar
Mostafa Dehghani
N. Houlsby
UQCV
ViT
29
11
0
02 Feb 2023
Modality-Agnostic Variational Compression of Implicit Neural Representations
Jonathan Richard Schwarz
Jihoon Tack
Yee Whye Teh
Jaeho Lee
Jinwoo Shin
34
25
0
23 Jan 2023
Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
Chengyi Wang
Sanyuan Chen
Yu-Huan Wu
Zi-Hua Zhang
Long Zhou
...
Huaming Wang
Jinyu Li
Lei He
Sheng Zhao
Furu Wei
48
648
0
05 Jan 2023
On Transforming Reinforcement Learning by Transformer: The Development Trajectory
Shengchao Hu
Li Shen
Ya Zhang
Yixin Chen
Dacheng Tao
OffRL
27
25
0
29 Dec 2022
Parameter-Efficient Tuning on Layer Normalization for Pre-trained Language Models
Wang Qi
Yu-Ping Ruan
Y. Zuo
Taihao Li
27
18
0
16 Nov 2022
How Does a Deep Learning Model Architecture Impact Its Privacy? A Comprehensive Study of Privacy Attacks on CNNs and Transformers
Guangsheng Zhang
B. Liu
Huan Tian
Tianqing Zhu
Ming Ding
Wanlei Zhou
PILM
MIACV
20
5
0
20 Oct 2022
MTet: Multi-domain Translation for English and Vietnamese
C. Ngo
Trieu H. Trinh
Long Phan
H. Tran
Tai Dang
Hieu Duy Nguyen
Minh Le Nguyen
Minh-Thang Luong
VLM
37
8
0
11 Oct 2022
A Transformer-based deep neural network model for SSVEP classification
Jianbo Chen
Yangsong Zhang
Yudong Pan
Peng Xu
Cuntai Guan
22
50
0
09 Oct 2022
Breaking Time Invariance: Assorted-Time Normalization for RNNs
Cole Pospisil
Vasily Zadorozhnyy
Qiang Ye
21
0
0
28 Sep 2022
Batch Layer Normalization, A new normalization layer for CNNs and RNN
A. Ziaee
Erion Çano
19
13
0
19 Sep 2022
Orthogonal Gated Recurrent Unit with Neumann-Cayley Transformation
Edison Mucllari
Vasily Zadorozhnyy
Cole Pospisil
D. Nguyen
Qiang Ye
41
3
0
12 Aug 2022
InitialGAN: A Language GAN with Completely Random Initialization
Da Ren
Qing Li
GAN
32
2
0
04 Aug 2022
Unified Normalization for Accelerating and Stabilizing Transformers
Qiming Yang
Kai Zhang
Chaoxiang Lan
Zhi Yang
Zheyang Li
Wenming Tan
Jun Xiao
Shiliang Pu
17
8
0
02 Aug 2022
Search for or Navigate to? Dual Adaptive Thinking for Object Navigation
Ronghao Dang
Liuyi Wang
Zongtao He
Shuai Su
Chengju Liu
Qi Chen
14
16
0
01 Aug 2022
Understanding and Improving Group Normalization
Agus Gunawan
Xu Yin
Kang Zhang
15
3
0
05 Jul 2022
Revisiting lp-constrained Softmax Loss: A Comprehensive Study
C. Trivedi
Konstantinos Makantasis
Antonios Liapis
Georgios N. Yannakakis
13
0
0
20 Jun 2022
B2T Connection: Serving Stability and Performance in Deep Transformers
Sho Takase
Shun Kiyono
Sosuke Kobayashi
Jun Suzuki
18
10
0
01 Jun 2022
THE-X: Privacy-Preserving Transformer Inference with Homomorphic Encryption
Tianyu Chen
Hangbo Bao
Shaohan Huang
Li Dong
Binxing Jiao
Daxin Jiang
Haoyi Zhou
Jianxin Li
Furu Wei
23
96
0
01 Jun 2022
Batch Normalization Is Blind to the First and Second Derivatives of the Loss
Zhanpeng Zhou
Wen Shen
Huixin Chen
Ling Tang
Quanshi Zhang
34
2
0
30 May 2022
Mitigating Neural Network Overconfidence with Logit Normalization
Hongxin Wei
Renchunzi Xie
Hao-Ran Cheng
Lei Feng
Bo An
Yixuan Li
OODD
163
267
0
19 May 2022
Incorporating Dynamic Semantics into Pre-Trained Language Model for Aspect-based Sentiment Analysis
Kai Zhang
Kunpeng Zhang
Mengdi Zhang
Hongke Zhao
Qi Liu
Wei Wu
Enhong Chen
11
51
0
30 Mar 2022
Continual Normalization: Rethinking Batch Normalization for Online Continual Learning
Quang Pham
Chenghao Liu
Guosheng Lin
BDL
OnRL
38
57
0
30 Mar 2022
TrimBERT: Tailoring BERT for Trade-offs
S. N. Sridhar
Anthony Sarah
Sairam Sundaresan
MQ
26
4
0
24 Feb 2022
Self-Awareness Safety of Deep Reinforcement Learning in Road Traffic Junction Driving
Zehong Cao
Jie Yun
19
6
0
20 Jan 2022
AdaTerm: Adaptive T-Distribution Estimated Robust Moments for Noise-Robust Stochastic Gradient Optimization
Wendyam Eric Lionel Ilboudo
Taisuke Kobayashi
Takamitsu Matsubara
39
12
0
18 Jan 2022