ResearchTrend.AI
Layer Normalization

21 July 2016
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton

Papers citing "Layer Normalization"

50 / 5,529 papers shown
Adaptive Transformers in RL
Shakti Kumar
Jerrod Parker
Panteha Naderian
OffRL
AI4CE
14
13
0
08 Apr 2020
Re-translation versus Streaming for Simultaneous Translation
N. Arivazhagan
Colin Cherry
Wolfgang Macherey
George F. Foster
39
63
0
07 Apr 2020
Neural Analogical Matching
Mayank Agarwal
Constantine Nakos
Ibrahim Abdelaziz
Kenneth D. Forbus
NAI
HAI
21
14
0
07 Apr 2020
Efficient Context and Schema Fusion Networks for Multi-Domain Dialogue State Tracking
Su Zhu
Jieyu Li
Lu Chen
Kai Yu
46
57
0
07 Apr 2020
How Do You Act? An Empirical Study to Understand Behavior of Deep Reinforcement Learning Agents
Richard Meyes
Moritz Schneider
Tobias Meisen
31
2
0
07 Apr 2020
RYANSQL: Recursively Applying Sketch-based Slot Fillings for Complex Text-to-SQL in Cross-Domain Databases
Donghyun Choi
M. Shin
EungGyun Kim
Dong Ryeol Shin
38
123
0
07 Apr 2020
MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices
Zhiqing Sun
Hongkun Yu
Xiaodan Song
Renjie Liu
Yiming Yang
Denny Zhou
MQ
59
801
0
06 Apr 2020
Evolving Normalization-Activation Layers
Hanxiao Liu
Andrew Brock
Karen Simonyan
Quoc V. Le
55
79
0
06 Apr 2020
Rethinking Spatially-Adaptive Normalization
Zhentao Tan
Dongdong Chen
Qi Chu
Menglei Chai
Jing Liao
Mingming He
Lu Yuan
Nenghai Yu
39
12
0
06 Apr 2020
At Which Level Should We Extract? An Empirical Analysis on Extractive Document Summarization
Qingyu Zhou
Furu Wei
Ming Zhou
32
21
0
06 Apr 2020
Residual Shuffle-Exchange Networks for Fast Processing of Long Sequences
Andis Draguns
Emīls Ozoliņš
A. Sostaks
Matiss Apinis
Kārlis Freivalds
19
8
0
06 Apr 2020
Bootstrapping a Crosslingual Semantic Parser
Tom Sherborne
Yumo Xu
Mirella Lapata
46
25
0
06 Apr 2020
TraDE: Transformers for Density Estimation
Rasool Fakoor
Pratik Chaudhari
Jonas W. Mueller
Alex Smola
47
30
0
06 Apr 2020
CG-BERT: Conditional Text Generation with BERT for Generalized Few-shot Intent Detection
Congying Xia
Chenwei Zhang
Hoang Nguyen
Jiawei Zhang
Philip Yu
14
43
0
04 Apr 2020
Pre-training for Abstractive Document Summarization by Reinstating Source Text
Yanyan Zou
Xingxing Zhang
Wei Lu
Furu Wei
Ming Zhou
36
1
0
04 Apr 2020
Gradient Centralization: A New Optimization Technique for Deep Neural Networks
Hongwei Yong
Jianqiang Huang
Xiansheng Hua
Lei Zhang
ODL
32
184
0
03 Apr 2020
Pixel-BERT: Aligning Image Pixels with Text by Deep Multi-Modal Transformers
Zhicheng Huang
Zhaoyang Zeng
Bei Liu
Dongmei Fu
Jianlong Fu
ViT
69
437
0
02 Apr 2020
Improved RawNet with Feature Map Scaling for Text-independent Speaker Verification using Raw Waveforms
Jee-weon Jung
Seung-bin Kim
Hye-jin Shim
Ju-ho Kim
Ha-Jin Yu
26
60
0
01 Apr 2020
Sample Efficient Ensemble Learning with Catalyst.RL
Sergey Kolesnikov
Valentin Khrulkov
20
4
0
29 Mar 2020
GPS-Net: Graph Property Sensing Network for Scene Graph Generation
Xin Lin
Changxing Ding
Jinquan Zeng
Dacheng Tao
70
278
0
29 Mar 2020
Serialized Output Training for End-to-End Overlapped Speech Recognition
Naoyuki Kanda
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Takuya Yoshioka
32
113
0
28 Mar 2020
An Investigation into the Stochasticity of Batch Whitening
Lei Huang
Lei Zhao
Yi Zhou
Fan Zhu
Li Liu
Ling Shao
50
18
0
27 Mar 2020
Improved Techniques for Training Single-Image GANs
Tobias Hinz
Matthew Fisher
Oliver Wang
S. Wermter
GAN
VLM
20
144
0
25 Mar 2020
Deep Reinforcement Learning with Robust and Smooth Policy
Qianli Shen
Yuante Li
Haoming Jiang
Zhaoran Wang
T. Zhao
OOD
36
5
0
21 Mar 2020
Cross-Shape Attention for Part Segmentation of 3D Point Clouds
Marios Loizou
Siddhant Garg
Dmitry Petrov
Melinos Averkiou
E. Kalogerakis
3DPC
32
2
0
20 Mar 2020
Normalized and Geometry-Aware Self-Attention Network for Image Captioning
Longteng Guo
Jing Liu
Xinxin Zhu
Peng Yao
Shichen Lu
Hanqing Lu
ViT
137
189
0
19 Mar 2020
Exemplar Normalization for Learning Deep Representation
Ruimao Zhang
Zhanglin Peng
Lingyun Wu
Zhuguo Li
Ping Luo
OOD
57
13
0
19 Mar 2020
Lighthouse: Predicting Lighting Volumes for Spatially-Coherent Illumination
Pratul P. Srinivasan
B. Mildenhall
Matthew Tancik
Jonathan T. Barron
Richard Tucker
Noah Snavely
3DV
38
94
0
18 Mar 2020
Scene Text Recognition via Transformer
Xinjie Feng
Huanjin Yao
Yuankai Qi
Jun Zhang
Shengping Zhang
ViT
33
9
0
18 Mar 2020
Boosting Unconstrained Face Recognition with Auxiliary Unlabeled Data
Yichun Shi
Anil K. Jain
CVBM
61
1
0
17 Mar 2020
PowerNorm: Rethinking Batch Normalization in Transformers
Sheng Shen
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
BDL
24
16
0
17 Mar 2020
Multi-modal Dense Video Captioning
Vladimir E. Iashin
Esa Rahtu
27
165
0
17 Mar 2020
Geometric Approaches to Increase the Expressivity of Deep Neural Networks for MR Reconstruction
Eunju Cha
Gyutaek Oh
J. C. Ye
37
11
0
17 Mar 2020
TRANS-BLSTM: Transformer with Bidirectional LSTM for Language Understanding
Zhiheng Huang
Peng Xu
Davis Liang
Ajay K. Mishra
Bing Xiang
15
31
0
16 Mar 2020
GMM-UNIT: Unsupervised Multi-Domain and Multi-Modal Image-to-Image Translation via Attribute Gaussian Mixture Modeling
Yahui Liu
Marco De Nadai
Jian Yao
N. Sebe
Bruno Lepri
Xavier Alameda-Pineda
29
25
0
15 Mar 2020
Invariant Causal Prediction for Block MDPs
Amy Zhang
Clare Lyle
Shagun Sodhani
Angelos Filos
Marta Z. Kwiatkowska
Joelle Pineau
Y. Gal
Doina Precup
OffRL
AI4CE
OOD
43
139
0
12 Mar 2020
Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy
M. Saffar
Ashish Vaswani
David Grangier
MoE
266
585
0
12 Mar 2020
Extended Batch Normalization
Chunjie Luo
Jianfeng Zhan
Lei Wang
Wanling Gao
42
14
0
12 Mar 2020
How Powerful Are Randomly Initialized Pointcloud Set Functions?
Aditya Sanghi
P. Jayaraman
3DPC
25
3
0
11 Mar 2020
ReZero is All You Need: Fast Convergence at Large Depth
Thomas C. Bachlechner
Bodhisattwa Prasad Majumder
H. H. Mao
G. Cottrell
Julian McAuley
AI4CE
35
276
0
10 Mar 2020
Learning to Respond with Stickers: A Framework of Unifying Multi-Modality in Multi-Turn Dialog
Shen Gao
Preslav Nakov
Chang Liu
Li Liu
Dongyan Zhao
Rui Yan
34
32
0
10 Mar 2020
Hybrid Attention-Based Transformer Block Model for Distant Supervision Relation Extraction
Yan Xiao
Yaochu Jin
Ran Cheng
K. Hao
12
31
0
10 Mar 2020
Communication-Efficient Distributed Deep Learning: A Comprehensive Survey
Zhenheng Tang
Shaoshuai Shi
Wei Wang
Yue Liu
Xiaowen Chu
31
48
0
10 Mar 2020
Cross-Modal Food Retrieval: Learning a Joint Embedding of Food Images and Recipes with Semantic Consistency and Attention Mechanism
Hao Wang
Doyen Sahoo
Chenghao Liu
Ke Shu
Palakorn Achananuparp
Ee-Peng Lim
Guosheng Lin
70
46
0
09 Mar 2020
ProGen: Language Modeling for Protein Generation
Ali Madani
Bryan McCann
Nikhil Naik
N. Keskar
N. Anand
Raphael R. Eguchi
Po-Ssu Huang
R. Socher
34
276
0
08 Mar 2020
Synaptic Metaplasticity in Binarized Neural Networks
Axel Laborieux
M. Ernoult
T. Hirtzlin
D. Querlioz
CLL
31
62
0
07 Mar 2020
TTPP: Temporal Transformer with Progressive Prediction for Efficient Action Anticipation
Wen Wang
Xiaojiang Peng
Yanzhou Su
Yu Qiao
Jian Cheng
AI4TS
25
18
0
07 Mar 2020
TaskNorm: Rethinking Batch Normalization for Meta-Learning
J. Bronskill
Jonathan Gordon
James Requeima
Sebastian Nowozin
Richard Turner
73
89
0
06 Mar 2020
Teaching Temporal Logics to Neural Networks
Christopher Hahn
Frederik Schmitt
Jens U. Kreber
M. Rabe
Bernd Finkbeiner
NAI
40
66
0
06 Mar 2020
Diverse and Admissible Trajectory Forecasting through Multimodal Context Understanding
Seonguk Park
Gyubok Lee
Manoj Bhat
Jimin Seo
Minseok Kang
Jonathan M Francis
Ashwin R. Jadhav
Paul Pu Liang
Louis-Philippe Morency
138
119
0
06 Mar 2020