ResearchTrend.AI
Layer Normalization


21 July 2016
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton

Papers citing "Layer Normalization"

50 / 5,528 papers shown
Adaptive Prediction Timing for Electronic Health Records
J. Deasy
A. Ercole
Pietro Lio
OOD
19
1
0
05 Mar 2020
q-VAE for Disentangled Representation Learning and Latent Dynamical Systems
Taisuke Kobayashi
BDL
DRL
41
17
0
04 Mar 2020
Deep Learning in Memristive Nanowire Networks
Jack D. Kendall
Ross D. Pantone
J. Nino
6
2
0
03 Mar 2020
Batch Normalization Provably Avoids Rank Collapse for Randomly Initialised Deep Networks
Hadi Daneshmand
Jonas Köhler
Francis R. Bach
Thomas Hofmann
Aurelien Lucchi
OOD
ODL
10
4
0
03 Mar 2020
Meta-Embeddings Based On Self-Attention
Qichen Li
Xiaoke Jiang
Jun Xia
Jian Li
21
2
0
03 Mar 2020
Curriculum By Smoothing
Samarth Sinha
Animesh Garg
Hugo Larochelle
21
7
0
03 Mar 2020
Benchmarking Graph Neural Networks
Vijay Prakash Dwivedi
Chaitanya K. Joshi
Anh Tuan Luu
T. Laurent
Yoshua Bengio
Xavier Bresson
194
927
0
02 Mar 2020
Transformer++
Prakhar Thapak
P. Hore
14
0
0
02 Mar 2020
Style Example-Guided Text Generation using Generative Adversarial Transformers
Kuo-Hao Zeng
Mohammad Shoeybi
Ming-Yu Liu
GAN
23
18
0
02 Mar 2020
Unblind Your Apps: Predicting Natural-Language Labels for Mobile GUI Components by Deep Learning
Jieshan Chen
Chunyang Chen
Zhenchang Xing
Xiwei Xu
Liming Zhu
Guoqiang Li
Jinshui Wang
19
139
0
01 Mar 2020
Channel Equilibrium Networks for Learning Deep Representation
Wenqi Shao
Shitao Tang
Xingang Pan
Ping Tan
Xiaogang Wang
Ping Luo
30
17
0
29 Feb 2020
Augmented Cyclic Consistency Regularization for Unpaired Image-to-Image Translation
Takehiko Ohkawa
Naoto Inoue
Hirokatsu Kataoka
Nakamasa Inoue
37
6
0
29 Feb 2020
Two Routes to Scalable Credit Assignment without Weight Symmetry
D. Kunin
Aran Nayebi
Javier Sagastuy-Breña
Surya Ganguli
Jonathan M. Bloom
Daniel L. K. Yamins
38
32
0
28 Feb 2020
RP-DNN: A Tweet level propagation context based deep neural networks for early rumor detection in Social Media
Jie Gao
Sooji Han
Xingyi Song
F. Ciravegna
28
20
0
28 Feb 2020
Modeling Future Cost for Neural Machine Translation
Chaoqun Duan
Kehai Chen
Rui Wang
Masao Utiyama
Eiichiro Sumita
Conghui Zhu
Tiejun Zhao
AI4TS
30
15
0
28 Feb 2020
Advances in Collaborative Filtering and Ranking
Liwei Wu
22
7
0
27 Feb 2020
Deep Residual-Dense Lattice Network for Speech Enhancement
M. Nikzad
Aaron Nicolson
Yongsheng Gao
Jun Zhou
K. Paliwal
Fanhua Shang
14
38
0
27 Feb 2020
Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers
Zhuohan Li
Eric Wallace
Sheng Shen
Kevin Lin
Kurt Keutzer
Dan Klein
Joseph E. Gonzalez
32
149
0
26 Feb 2020
Refined Gate: A Simple and Effective Gating Mechanism for Recurrent Units
Zhanzhan Cheng
Yunlu Xu
Mingjian Cheng
Yu Qiao
Shiliang Pu
Yi Niu
Fei Wu
16
8
0
26 Feb 2020
On Feature Normalization and Data Augmentation
Boyi Li
Felix Wu
Ser-Nam Lim
Serge J. Belongie
Kilian Q. Weinberger
26
134
0
25 Feb 2020
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
Wenhui Wang
Furu Wei
Li Dong
Hangbo Bao
Nan Yang
Ming Zhou
VLM
54
1,224
0
25 Feb 2020
Layer-wise Conditioning Analysis in Exploring the Learning Dynamics of DNNs
Lei Huang
Jie Qin
Li Liu
Fan Zhu
Ling Shao
AI4CE
31
11
0
25 Feb 2020
Exploring BERT Parameter Efficiency on the Stanford Question Answering Dataset v2.0
Eric Hulburd
14
5
0
25 Feb 2020
Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training
Weituo Hao
Chunyuan Li
Xiujun Li
Lawrence Carin
Jianfeng Gao
LM&Ro
29
275
0
25 Feb 2020
Batch norm with entropic regularization turns deterministic autoencoders into generative models
Amur Ghose
Abdullah M. Rashwan
Pascal Poupart
UQCV
18
8
0
25 Feb 2020
Batch Normalization Biases Residual Blocks Towards the Identity Function in Deep Networks
Soham De
Samuel L. Smith
ODL
32
20
0
24 Feb 2020
End-to-End Neural Diarization: Reformulating Speaker Diarization as Simple Multi-label Classification
Yusuke Fujita
Shinji Watanabe
Shota Horiguchi
Yawen Xue
Kenji Nagamatsu
23
49
0
24 Feb 2020
GRET: Global Representation Enhanced Transformer
Rongxiang Weng
Hao-Ran Wei
Shujian Huang
Heng Yu
Lidong Bing
Weihua Luo
Jiajun Chen
27
9
0
24 Feb 2020
On Hiding Neural Networks Inside Neural Networks
Chuan Guo
Ruihan Wu
Kilian Q. Weinberger
12
5
0
24 Feb 2020
Interpretable Crowd Flow Prediction with Spatial-Temporal Self-Attention
Haoxing Lin
Weijia Jia
Yongjian You
Yiping Sun
AI4TS
35
6
0
22 Feb 2020
Learning to Simulate Complex Physics with Graph Networks
Alvaro Sanchez-Gonzalez
Jonathan Godwin
Tobias Pfaff
Rex Ying
J. Leskovec
Peter W. Battaglia
PINN
AI4CE
70
1,062
0
21 Feb 2020
Addressing Some Limitations of Transformers with Feedback Memory
Angela Fan
Thibaut Lavril
Edouard Grave
Armand Joulin
Sainbayar Sukhbaatar
33
11
0
21 Feb 2020
Transformer Hawkes Process
Simiao Zuo
Haoming Jiang
Zichong Li
T. Zhao
H. Zha
AI4TS
43
289
0
21 Feb 2020
AutoFoley: Artificial Synthesis of Synchronized Sound Tracks for Silent Videos with Deep Learning
Sanchita Ghose
John J. Prevost
VGen
27
46
0
21 Feb 2020
Learning Dynamic Belief Graphs to Generalize on Text-Based Games
Ashutosh Adhikari
Xingdi Yuan
Marc-Alexandre Côté
M. Zelinka
Marc-Antoine Rondeau
Romain Laroche
Pascal Poupart
Jian Tang
Adam Trischler
William L. Hamilton
AI4CE
40
81
0
21 Feb 2020
Adapted Center and Scale Prediction: More Stable and More Accurate
Wenhao Wang
28
24
0
20 Feb 2020
Wavesplit: End-to-End Speech Separation by Speaker Clustering
Neil Zeghidour
David Grangier
VLM
51
263
0
20 Feb 2020
A Novel Framework for Selection of GANs for an Application
Tanya Motwani
Manojkumar Somabhai Parmar
32
8
0
20 Feb 2020
Non-Autoregressive Dialog State Tracking
Hung Le
R. Socher
Guosheng Lin
45
52
0
19 Feb 2020
A Survey of Deep Learning Techniques for Neural Machine Translation
Shu Yang
Yuxin Wang
Xiaowen Chu
VLM
AI4TS
AI4CE
38
138
0
18 Feb 2020
A New Clustering neural network for Chinese word segmentation
Yuze Zhao
8
0
0
18 Feb 2020
Low-Rank Bottleneck in Multi-head Attention Models
Srinadh Bhojanapalli
Chulhee Yun
A. S. Rawat
Sashank J. Reddi
Sanjiv Kumar
24
94
0
17 Feb 2020
Multi-layer Representation Fusion for Neural Machine Translation
Qiang Wang
Fuxue Li
Tong Xiao
Yanyang Li
Yinqiao Li
Jingbo Zhu
AI4CE
33
52
0
16 Feb 2020
Neural Machine Translation with Joint Representation
Yanyang Li
Qiang Wang
Tong Xiao
Tongran Liu
Jingbo Zhu
4
9
0
16 Feb 2020
Transformer on a Diet
Chenguang Wang
Zihao Ye
Aston Zhang
Zheng Zhang
Alex Smola
32
8
0
14 Feb 2020
Towards an Appropriate Query, Key, and Value Computation for Knowledge Tracing
Youngduck Choi
Youngnam Lee
Junghyun Cho
Jineon Baek
Byungsoo Kim
Yeongmin Cha
Dongmin Shin
Chan Bae
Jaewe Heo
14
198
0
14 Feb 2020
Cross-Iteration Batch Normalization
Zhuliang Yao
Yu Cao
Shuxin Zheng
Gao Huang
Stephen Lin
19
85
0
13 Feb 2020
Deep Learning for Source Code Modeling and Generation: Models, Applications and Challenges
T. H. Le
Hao Chen
Muhammad Ali Babar
VLM
70
153
0
13 Feb 2020
Keyphrase Extraction with Span-based Feature Representations
Funan Mu
Zhenting Yu
Lifeng Wang
Yequan Wang
Qingyu Yin
Yibo Sun
Liqun Liu
Teng Ma
Jing Tang
Xing Zhou
53
17
0
13 Feb 2020
Regularizing activations in neural networks via distribution matching with the Wasserstein metric
Taejong Joo
Donggu Kang
Byunghoon Kim
40
8
0
13 Feb 2020