Layer Normalization

21 July 2016
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton

Papers citing "Layer Normalization"

50 / 5,520 papers shown
Multimodal Matching Transformer for Live Commenting
Chaoqun Duan
Lei Cui
Shuming Ma
Furu Wei
Conghui Zhu
Tiejun Zhao
14
12
0
07 Feb 2020
Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss
Qian Zhang
Han Lu
Hasim Sak
Anshuman Tripathi
Erik McDermott
Stephen Koo
Shankar Kumar
22
474
0
07 Feb 2020
Looking GLAMORous: Vehicle Re-Id in Heterogeneous Cameras Networks with Global and Local Attention
Abhijit Suprem
C. Pu
30
28
0
06 Feb 2020
Unbalanced GANs: Pre-training the Generator of Generative Adversarial Network using Variational Autoencoder
Hyung-Gi Ham
Tae Joon Jun
Daeyoung Kim
DRL
GAN
19
21
0
06 Feb 2020
IART: Intent-aware Response Ranking with Transformers in Information-seeking Conversation Systems
Liu Yang
Minghui Qiu
Chen Qu
Cen Chen
Jiafeng Guo
Yongfeng Zhang
W. Bruce Croft
Haiqing Chen
45
38
0
03 Feb 2020
Déjà vu: A Contextualized Temporal Attention Mechanism for Sequential Recommendation
Jibang Wu
Renqin Cai
Hongning Wang
HAI
AI4TS
6
63
0
29 Jan 2020
MEMO: A Deep Network for Flexible Combination of Episodic Memories
Andrea Banino
Adria Puigdomenech Badia
Raphael Köster
Martin Chadwick
V. Zambaldi
Demis Hassabis
Caswell Barry
M. Botvinick
D. Kumaran
Charles Blundell
KELM
31
33
0
29 Jan 2020
Scaling Up Online Speech Recognition Using ConvNets
Vineel Pratap
Qiantong Xu
Jacob Kahn
Gilad Avidov
Tatiana Likhomanenko
Awni Y. Hannun
Vitaliy Liptchinsky
Gabriel Synnaeve
R. Collobert
154
38
0
27 Jan 2020
Multimodal Data Fusion based on the Global Workspace Theory
C. Bao
Zafeirios Fountas
Temitayo A. Olugbade
N. Bianchi-Berthouze
35
7
0
26 Jan 2020
TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
Jie Lei
Licheng Yu
Tamara L. Berg
Joey Tianyi Zhou
144
277
0
24 Jan 2020
Unsupervised Representation Disentanglement using Cross Domain Features and Adversarial Learning in Variational Autoencoder based Voice Conversion
Wen-Chin Huang
Hao Luo
Hsin-Te Hwang
Chen-Chou Lo
Yu-Huai Peng
Yu Tsao
Hsin-Min Wang
DRL
22
42
0
22 Jan 2020
A Comprehensive Study on Temporal Modeling for Online Action Detection
Wen Wang
Xiaojiang Peng
Yu Qiao
Jian Cheng
39
2
0
21 Jan 2020
Towards Stabilizing Batch Statistics in Backward Propagation of Batch Normalization
Junjie Yan
Ruosi Wan
Xinming Zhang
Wei Zhang
Yichen Wei
Jian Sun
19
38
0
19 Jan 2020
Data-Driven Permanent Magnet Temperature Estimation in Synchronous Motors with Supervised Machine Learning
Wilhelm Kirchgässner
Oliver Wallscheid
J. Böcker
25
69
0
17 Jan 2020
Cut-Based Graph Learning Networks to Discover Compositional Structure of Sequential Video Data
Kyoung-Woon On
Eun-Sol Kim
Y. Heo
Byoung-Tak Zhang
BDL
16
6
0
17 Jan 2020
Delving Deeper into the Decoder for Video Captioning
Haoran Chen
Jianmin Li
Xiaolin Hu
48
34
0
16 Jan 2020
Transformer-based Online CTC/attention End-to-End Speech Recognition Architecture
Haoran Miao
Gaofeng Cheng
Changfeng Gao
Pengyuan Zhang
Yonghong Yan
16
102
0
15 Jan 2020
Reformer: The Efficient Transformer
Nikita Kitaev
Lukasz Kaiser
Anselm Levskaya
VLM
68
2,262
0
13 Jan 2020
Parameter-Efficient Transfer from Sequential Behaviors for User Modeling and Recommendation
Fajie Yuan
Xiangnan He
Alexandros Karatzoglou
Liguang Zhang
21
0
0
13 Jan 2020
An Internal Covariate Shift Bounding Algorithm for Deep Neural Networks by Unitizing Layers' Outputs
You Huang
Yuanlong Yu
24
6
0
09 Jan 2020
Knowledge-aware Attention Network for Protein-Protein Interaction Extraction
Huiwei Zhou
Zhuang Liu
Shixian Ning
Chengkun Lang
Yingyu Lin
Lei Du
22
11
0
07 Jan 2020
Domain Adaptation via Teacher-Student Learning for End-to-End Speech Recognition
Zhong Meng
Jinyu Li
Yashesh Gaur
Jiawei Liu
17
50
0
06 Jan 2020
Character-Aware Attention-Based End-to-End Speech Recognition
Zhong Meng
Yashesh Gaur
Jinyu Li
Jiawei Liu
23
10
0
06 Jan 2020
Unpaired Multi-modal Segmentation via Knowledge Distillation
Qi Dou
Quande Liu
Pheng Ann Heng
Ben Glocker
39
173
0
06 Jan 2020
Self-Orthogonality Module: A Network Architecture Plug-in for Learning Orthogonal Filters
Ziming Zhang
Wenchi Ma
Yuanwei Wu
Guanghui Wang
40
10
0
05 Jan 2020
Deep Learning for Learning Graph Representations
Wenwu Zhu
Xin Eric Wang
Peng Cui
GNN
AI4CE
14
22
0
02 Jan 2020
Assessment Modeling: Fundamental Pre-training Tasks for Interactive Educational Systems
Youngduck Choi
Youngnam Lee
Junghyun Cho
Jineon Baek
Dongmin Shin
...
Seewoo Lee
Youngmin Cha
Chan Bae
Byungsoo Kim
Jaewe Heo
AI4Ed
18
14
0
01 Jan 2020
Deep Attentive Ranking Networks for Learning to Order Sentences
Pawan Kumar
Dhanajit Brahma
H. Karnick
Piyush Rai
21
45
0
31 Dec 2019
OneGAN: Simultaneous Unsupervised Learning of Conditional Image Generation, Foreground Segmentation, and Fine-Grained Clustering
Yaniv Benny
Lior Wolf
VLM
GAN
21
48
0
31 Dec 2019
EEG based Continuous Speech Recognition using Transformers
G. Krishna
Co Tran
Mason Carnahan
Ahmed H. Tewfik
22
15
0
31 Dec 2019
Neural ODEs for Image Segmentation with Level Sets
Rafael Valle
F. Reda
Mohammad Shoeybi
P. LeGresley
Andrew Tao
Bryan Catanzaro
25
8
0
25 Dec 2019
RecVAE: a New Variational Autoencoder for Top-N Recommendations with Implicit Feedback
Ilya Shenbin
Anton M. Alekseev
E. Tutubalina
Valentin Malykh
Sergey I. Nikolenko
BDL
DRL
21
196
0
24 Dec 2019
Axial Attention in Multidimensional Transformers
Jonathan Ho
Nal Kalchbrenner
Dirk Weissenborn
Tim Salimans
36
520
0
20 Dec 2019
ET-USB: Transformer-Based Sequential Behavior Modeling for Inbound Customer Service
Ta-Chun Su
Guan-Ying Chen
20
0
0
20 Dec 2019
Group-Connected Multilayer Perceptron Networks
Mohammad Kachuee
Sajad Darabi
Shayan Fazeli
Majid Sarrafzadeh
AI4CE
22
1
0
20 Dec 2019
Temporal Fusion Transformers for Interpretable Multi-horizon Time Series Forecasting
Bryan Lim
Sercan O. Arik
Nicolas Loeff
Tomas Pfister
AI4TS
71
1,417
0
19 Dec 2019
Optimization for deep learning: theory and algorithms
Ruoyu Sun
ODL
52
168
0
19 Dec 2019
Relational Mimic for Visual Adversarial Imitation Learning
Lionel Blondé
Yichuan Tang
Jian Zhang
Russ Webb
36
0
0
18 Dec 2019
Synchronous Speech Recognition and Speech-to-Text Translation with Interactive Decoding
Yuchen Liu
Jiajun Zhang
Hao Xiong
Long Zhou
Zhongjun He
Hua Wu
Haifeng Wang
Chengqing Zong
42
70
0
16 Dec 2019
Efficient Convolutional Neural Networks for Diacritic Restoration
Sawsan Alqahtani
Ajay K. Mishra
Mona T. Diab
27
24
0
14 Dec 2019
Spatial-Temporal Self-Attention Network for Flow Prediction
Haoxing Lin
Weijia Jia
Yiping Sun
Yongjian You
3DPC
AI4TS
40
8
0
13 Dec 2019
Local Context Normalization: Revisiting Local Normalization
Anthony Ortiz
Caleb Robinson
Dan Morris
O. Fuentes
Christopher Kiekintveld
Mahmudulla Hassan
Nebojsa Jojic
19
25
0
12 Dec 2019
SpecAugment on Large Scale Datasets
Daniel S. Park
Yu Zhang
Chung-Cheng Chiu
Youzheng Chen
Yue Liu
William Chan
Quoc V. Le
Yonghui Wu
27
136
0
11 Dec 2019
VideoDG: Generalizing Temporal Relations in Videos to Novel Domains
Zhiyu Yao
Yunbo Wang
Jianmin Wang
Philip S. Yu
Mingsheng Long
OOD
ViT
32
23
0
08 Dec 2019
Connecting Vision and Language with Localized Narratives
Jordi Pont-Tuset
J. Uijlings
Soravit Changpinyo
Radu Soricut
V. Ferrari
ObjD
36
244
0
06 Dec 2019
Weak Supervision helps Emergence of Word-Object Alignment and improves Vision-Language Tasks
Corentin Kervadec
G. Antipov
M. Baccouche
Christian Wolf
26
15
0
06 Dec 2019
Semantic Mask for Transformer based End-to-End Speech Recognition
Chengyi Wang
Yu Wu
Yujiao Du
Jinyu Li
Shujie Liu
Liang Lu
Shuo Ren
Guoli Ye
Sheng Zhao
Ming Zhou
15
51
0
06 Dec 2019
Neural Machine Translation: A Review and Survey
Felix Stahlberg
3DV
AI4TS
MedIm
39
313
0
04 Dec 2019
StarGAN v2: Diverse Image Synthesis for Multiple Domains
Yunjey Choi
Youngjung Uh
Jaejun Yoo
Jung-Woo Ha
3DH
64
1,725
0
04 Dec 2019
Acquiring Knowledge from Pre-trained Model to Neural Machine Translation
Rongxiang Weng
Heng Yu
Shujian Huang
Shanbo Cheng
Weihua Luo
30
66
0
04 Dec 2019