ResearchTrend.AI
Layer Normalization

21 July 2016
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton
arXiv: 1607.06450

Papers citing "Layer Normalization"

50 / 5,516 papers shown
Reducing Transformer Depth on Demand with Structured Dropout
Angela Fan
Edouard Grave
Armand Joulin
48
585
0
25 Sep 2019
Gated Channel Transformation for Visual Recognition
Zongxin Yang
Linchao Zhu
Yu Wu
Yezhou Yang
ViT
22
204
0
25 Sep 2019
Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models
Cheolhyoung Lee
Kyunghyun Cho
Wanmo Kang
MoE
249
208
0
25 Sep 2019
Learning Propagation for Arbitrarily-structured Data
Sifei Liu
Xueting Li
Varun Jampani
Shalini De Mello
Jan Kautz
GNN
37
0
0
25 Sep 2019
Learning Visual Relation Priors for Image-Text Matching and Image Captioning with Neural Scene Graph Generators
Kuang-Huei Lee
Hamid Palangi
Xi Chen
Houdong Hu
Jianfeng Gao
VLM
30
37
0
22 Sep 2019
Using Chinese Glyphs for Named Entity Recognition
Arijit Sehanobish
Chan Hee Song
26
22
0
22 Sep 2019
Graph Convolutions over Constituent Trees for Syntax-Aware Semantic Role Labeling
Diego Marcheggiani
Ivan Titov
GNN
19
39
0
21 Sep 2019
IntersectGAN: Learning Domain Intersection for Generating Images with Multiple Attributes
Zehui Yao
Boyan Zhang
Zhiyong Wang
Wanli Ouyang
Dong Xu
Dagan Feng
GAN
CVBM
9
3
0
21 Sep 2019
From feature selection to continuous optimization
H. Rakhshani
L. Idoumghar
Julien Lepagnot
Mathieu Brévilliers
17
1
0
20 Sep 2019
Simple, Scalable Adaptation for Neural Machine Translation
Ankur Bapna
N. Arivazhagan
Orhan Firat
AI4CE
58
408
0
18 Sep 2019
Impact of novel aggregation methods for flexible, time-sensitive EHR prediction without variable selection or cleaning
J. Deasy
A. Ercole
Pietro Lio
OOD
34
2
0
17 Sep 2019
Emergent Tool Use From Multi-Agent Autocurricula
Bowen Baker
I. Kanitscheider
Todor Markov
Yi Wu
Glenn Powell
Bob McGrew
Igor Mordatch
LRM
54
647
0
17 Sep 2019
Hint-Based Training for Non-Autoregressive Machine Translation
Zhuohan Li
Zi Lin
Di He
Fei Tian
Tao Qin
Liwei Wang
Tie-Yan Liu
31
72
0
15 Sep 2019
Spatiotemporal Attention Networks for Wind Power Forecasting
Xingbo Fu
F. Gao
Jiang Wu
Xinyu Wei
Fangwei Duan
AI4TS
14
32
0
14 Sep 2019
End-to-End Neural Speaker Diarization with Self-attention
Yusuke Fujita
Naoyuki Kanda
Shota Horiguchi
Yawen Xue
Kenji Nagamatsu
Shinji Watanabe
190
238
0
13 Sep 2019
CTRL: A Conditional Transformer Language Model for Controllable Generation
N. Keskar
Bryan McCann
Lav Varshney
Caiming Xiong
R. Socher
AI4CE
62
1,237
0
11 Sep 2019
Dependency-Aware Named Entity Recognition with Relative and Global Attentions
Gustavo Aguilar
Thamar Solorio
29
9
0
11 Sep 2019
Global Locality in Biomedical Relation and Event Extraction
Elaheh Shafieibavani
Antonio Jimeno Yepes
Xu Zhong
David Martínez
12
4
0
11 Sep 2019
Core Semantic First: A Top-down Approach for AMR Parsing
Deng Cai
W. Lam
GNN
32
53
0
10 Sep 2019
Scene Recognition with Prototype-agnostic Scene Layout
Gongwei Chen
Xinhang Song
Haitao Zeng
Shuqiang Jiang
30
56
0
07 Sep 2019
Linear Context Transform Block
D. Ruan
Jun Wen
Nenggan Zheng
Min Zheng
ViT
19
22
0
06 Sep 2019
Help, Anna! Visual Navigation with Natural Multimodal Assistance via Retrospective Curiosity-Encouraging Imitation Learning
Khanh Nguyen
Hal Daumé
LM&Ro
EgoV
183
150
0
04 Sep 2019
Deep Equilibrium Models
Shaojie Bai
J. Zico Kolter
V. Koltun
14
657
0
03 Sep 2019
Logic and the 2-Simplicial Transformer
James Clift
D. Doryn
Daniel Murfet
James Wallbridge
NAI
21
3
0
02 Sep 2019
Enriching Medcial Terminology Knowledge Bases via Pre-trained Language Model and Graph Convolutional Network
Jiaying Zhang
Zhixing Zhang
Huanhuan Zhang
Zhiyuan Ma
Yangming Zhou
Ping He
MedIm
16
0
0
02 Sep 2019
Subword Language Model for Query Auto-Completion
Gyuwan Kim
17
14
0
02 Sep 2019
Global Entity Disambiguation with BERT
Ikuya Yamada
Koki Washio
Hiroyuki Shindo
Yuji Matsumoto
22
31
0
01 Sep 2019
Repurposing Decoder-Transformer Language Models for Abstractive Summarization
Luke de Oliveira
Alfredo Láinez Rodrigo
19
4
0
01 Sep 2019
Improving Deep Transformer with Depth-Scaled Initialization and Merged Attention
Biao Zhang
Ivan Titov
Rico Sennrich
16
102
0
29 Aug 2019
Regularized Context Gates on Transformer for Machine Translation
Xintong Li
Lemao Liu
Rui Wang
Guoping Huang
M. Meng
53
4
0
29 Aug 2019
Key Protected Classification for Collaborative Learning
Mert Bulent Sariyildiz
R. G. Cinbis
Erman Ayday
32
10
0
27 Aug 2019
A deep artificial neural network based model for underlying cause of death prediction from death certificates
Louis Falissard
C. Morgand
Sylvie Roussel
Claire Imbaud
Walid Ghosn
Karim Bounebache
G. Rey
25
4
0
26 Aug 2019
Object-Driven Multi-Layer Scene Decomposition From a Single Image
Helisa Dhamo
Nassir Navab
Federico Tombari
3DV
27
30
0
26 Aug 2019
What are Neural Networks made of?
Rene Schaub
ODL
14
0
0
25 Aug 2019
Towards Unconstrained End-to-End Text Spotting
Siyang Qin
Alessandro Bissacco
Michalis Raptis
Yasuhisa Fujii
Y. Xiao
26
129
0
24 Aug 2019
Reference Network for Neural Machine Translation
Han Fu
Chenghao Liu
Jianling Sun
14
1
0
23 Aug 2019
Text Summarization with Pretrained Encoders
Yang Liu
Mirella Lapata
MILM
263
1,436
0
22 Aug 2019
U-Net Training with Instance-Layer Normalization
Xiao-Yun Zhou
Peichao Li
Zhao-Yang Wang
Guang-Zhong Yang
34
9
0
21 Aug 2019
Fine-tuning BERT for Joint Entity and Relation Extraction in Chinese Medical Text
Kui Xue
Yangming Zhou
Zhiyuan Ma
Tong Ruan
Huanhuan Zhang
Ping He
32
89
0
21 Aug 2019
Attention on Attention for Image Captioning
Lun Huang
Wenmin Wang
Jie Chen
Xiao-Yong Wei
24
824
0
19 Aug 2019
Geometric Disentanglement for Generative Latent Shape Models
Tristan Aumentado-Armstrong
Stavros Tsogkas
Allan D. Jepson
Sven J. Dickinson
DRL
30
54
0
18 Aug 2019
A Multi-Type Multi-Span Network for Reading Comprehension that Requires Discrete Reasoning
Minghao Hu
Yuxing Peng
Zhen Huang
Dongsheng Li
AIMat
LRM
32
91
0
15 Aug 2019
Temporal Collaborative Ranking Via Personalized Transformer
Liwei Wu
Shuqing Li
Cho-Jui Hsieh
James Sharpnack
AI4TS
24
4
0
15 Aug 2019
FlowDelta: Modeling Flow Information Gain in Reasoning for Conversational Machine Comprehension
Yi-Ting Yeh
Yun-Nung Chen
14
40
0
14 Aug 2019
Complicated Table Structure Recognition
Zewen Chi
Heyan Huang
Heng-Da Xu
Houjin Yu
Wanxuan Yin
Xian-Ling Mao
LMTD
25
107
0
13 Aug 2019
Multimodal Unified Attention Networks for Vision-and-Language Interactions
Zhou Yu
Yuhao Cui
Jun Yu
Dacheng Tao
Q. Tian
27
38
0
12 Aug 2019
On the Variance of the Adaptive Learning Rate and Beyond
Liyuan Liu
Haoming Jiang
Pengcheng He
Weizhu Chen
Xiaodong Liu
Jianfeng Gao
Jiawei Han
ODL
8
1,889
0
08 Aug 2019
Universal Adversarial Audio Perturbations
Sajjad Abdoli
L. G. Hafemann
Jérôme Rony
Ismail Ben Ayed
P. Cardinal
Alessandro Lameiras Koerich
AAML
25
51
0
08 Aug 2019
Promoting Coordination through Policy Regularization in Multi-Agent Deep Reinforcement Learning
Julien Roy
Paul Barde
Félix G. Harvey
Derek Nowrouzezahrai
C. Pal
19
3
0
06 Aug 2019
View N-gram Network for 3D Object Retrieval
Xinwei He
Tengteng Huang
S. Bai
X. Bai
3DPC
26
56
0
06 Aug 2019