1607.06450
Cited By
Layer Normalization
21 July 2016
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton
Papers citing "Layer Normalization"
50 / 5,516 papers shown
Title
A Hierarchical Location Prediction Neural Network for Twitter User Geolocation
Binxuan Huang
Kathleen M. Carley
19
45
0
28 Oct 2019
Neural Architecture Evolution in Deep Reinforcement Learning for Continuous Control
Jörg Franke
Gregor Koehler
Noor H. Awad
Frank Hutter
30
7
0
28 Oct 2019
Tensor Programs I: Wide Feedforward or Recurrent Neural Networks of Any Architecture are Gaussian Processes
Greg Yang
33
194
0
28 Oct 2019
HUBERT Untangles BERT to Improve Transfer across NLP Tasks
M. Moradshahi
Hamid Palangi
M. Lam
P. Smolensky
Jianfeng Gao
34
16
0
25 Oct 2019
Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders
Andy T. Liu
Shu-Wen Yang
Po-Han Chi
Po-Chun Hsu
Hung-yi Lee
SSL
50
372
0
25 Oct 2019
Unsupervised Representation Learning with Future Observation Prediction for Speech Emotion Recognition
Zheng Lian
J. Tao
Bin Liu
Jian Huang
SSL
27
17
0
24 Oct 2019
Syntax-Enhanced Self-Attention-Based Semantic Role Labeling
Yue Zhang
Rui Wang
Luo Si
14
20
0
24 Oct 2019
Speech-XLNet: Unsupervised Acoustic Model Pretraining For Self-Attention Networks
Xingcheng Song
Guangsen Wang
Zhiyong Wu
Yiheng Huang
Dan Su
Dong Yu
Helen Meng
SSL
20
49
0
23 Oct 2019
Complex Transformer: A Framework for Modeling Complex-Valued Sequence
Muqiao Yang
Martin Q. Ma
Dongyu Li
Yao-Hung Hubert Tsai
Ruslan Salakhutdinov
ViT
19
37
0
22 Oct 2019
Sequence-to-sequence Singing Synthesis Using the Feed-forward Transformer
Merlijn Blaauw
J. Bonada
27
55
0
22 Oct 2019
Transformer-based Acoustic Modeling for Hybrid Speech Recognition
Yongqiang Wang
Abdel-rahman Mohamed
Duc Le
Chunxi Liu
Alex Xiao
...
Xiaohui Zhang
Frank Zhang
Christian Fuegen
Geoffrey Zweig
M. Seltzer
16
248
0
22 Oct 2019
Discriminative Neural Clustering for Speaker Diarisation
Qiujia Li
Florian Kreyssig
Chao Zhang
P. Woodland
11
44
0
22 Oct 2019
Signal Combination for Language Identification
Shengye Wang
Li Wan
Yang Yu
Ignacio López Moreno
23
12
0
21 Oct 2019
Self-Attentive Document Interaction Networks for Permutation Equivariant Ranking
Rama Kumar Pasumarthi
Xuanhui Wang
Michael Bendersky
Marc Najork
27
17
0
21 Oct 2019
Machine Learning Systems for Highly-Distributed and Rapidly-Growing Data
Kevin Hsieh
SyDa
OOD
16
4
0
18 Oct 2019
Root Mean Square Layer Normalization
Biao Zhang
Rico Sennrich
25
673
0
16 Oct 2019
An Exponential Learning Rate Schedule for Deep Learning
Zhiyuan Li
Sanjeev Arora
20
212
0
16 Oct 2019
Analyzing the Forgetting Problem in the Pretrain-Finetuning of Dialogue Response Models
Tianxing He
Jun Liu
Kyunghyun Cho
Myle Ott
Bing-Quan Liu
James R. Glass
Fuchun Peng
CLL
35
9
0
16 Oct 2019
Dual-path RNN: efficient long sequence modeling for time-domain single-channel speech separation
Yi Luo
Zhuo Chen
Takuya Yoshioka
AI4TS
45
762
0
14 Oct 2019
Transformers without Tears: Improving the Normalization of Self-Attention
Toan Q. Nguyen
Julian Salazar
50
225
0
14 Oct 2019
Stabilizing Transformers for Reinforcement Learning
Emilio Parisotto
H. F. Song
Jack W. Rae
Razvan Pascanu
Çağlar Gülçehre
...
Aidan Clark
Seb Noury
M. Botvinick
N. Heess
R. Hadsell
OffRL
24
360
0
13 Oct 2019
Multi-Task Learning for Conversational Question Answering over a Large-Scale Knowledge Base
Tao Shen
Xiubo Geng
Tao Qin
Daya Guo
Duyu Tang
Nan Duan
Guodong Long
Daxin Jiang
33
81
0
11 Oct 2019
Controllable Sentence Simplification: Employing Syntactic and Lexical Constraints
Jonathan Mallinson
Mirella Lapata
27
17
0
10 Oct 2019
Kernel-Based Approaches for Sequence Modeling: Connections to Neural Methods
Kevin J. Liang
Guoyin Wang
Yitong Li
Ricardo Henao
Lawrence Carin
35
2
0
09 Oct 2019
Imagined Value Gradients: Model-Based Policy Optimization with Transferable Latent Dynamics Models
Arunkumar Byravan
Jost Tobias Springenberg
A. Abdolmaleki
Roland Hafner
Michael Neunert
Thomas Lampe
Noah Y. Siegel
N. Heess
Martin Riedmiller
OffRL
17
41
0
09 Oct 2019
Improving Generalization in Meta Reinforcement Learning using Learned Objectives
Louis Kirsch
Sjoerd van Steenkiste
Jürgen Schmidhuber
OffRL
16
119
0
09 Oct 2019
Federated Learning of N-gram Language Models
Mingqing Chen
A. Suresh
Rajiv Mathews
Adeline Wong
Cyril Allauzen
F. Beaufays
Michael Riley
FedML
24
74
0
08 Oct 2019
One-To-Many Multilingual End-to-end Speech Translation
Mattia Antonino Di Gangi
Matteo Negri
Marco Turchi
37
50
0
08 Oct 2019
SMArT: Training Shallow Memory-aware Transformers for Robotic Explainability
Marcella Cornia
Lorenzo Baraldi
Rita Cucchiara
14
27
0
07 Oct 2019
Irregular Convolutional Auto-Encoder on Point Clouds
Yuhui Zhang
G. Gutmann
Konagaya Akihiko
3DPC
28
2
0
07 Oct 2019
MASTER: Multi-Aspect Non-local Network for Scene Text Recognition
Ning Lu
Wenwen Yu
Xianbiao Qi
Yihao Chen
Ping Gong
Rong Xiao
Xiang Bai
30
157
0
07 Oct 2019
Neural Language Priors
Joseph Enguehard
Dan Busbridge
V. Zhelezniak
Nils Y. Hammerla
31
3
0
04 Oct 2019
SNDCNN: Self-normalizing deep CNNs with scaled exponential linear units for speech recognition
Atanas G. Atanasov
Tim Ng
Leo Liu
Henry Mason
Xiaodan Zhuang
Daben Liu
23
40
0
04 Oct 2019
Farkas layers: don't shift the data, fix the geometry
Aram-Alexandre Pooladian
Chris Finlay
Adam M. Oberman
AI4CE
19
1
0
04 Oct 2019
Generating Relevant Counter-Examples from a Positive Unlabeled Dataset for Image Classification
Florent Chiaroni
G. Khodabandelou
Mohamed-Cherif Rahal
N. Hueber
Frederic Dufaux
12
4
0
04 Oct 2019
Towards Understanding of Medical Randomized Controlled Trials by Conclusion Generation
Max Landauer
Yung-Sung Chuang
Florian Skopik
Yun-Nung Chen
FaML
LM&MA
MedIm
20
8
0
03 Oct 2019
Improving Sample Efficiency in Model-Free Reinforcement Learning from Images
Denis Yarats
Amy Zhang
Ilya Kostrikov
Brandon Amos
Joelle Pineau
Rob Fergus
DRL
65
441
0
02 Oct 2019
Learning Maximally Predictive Prototypes in Multiple Instance Learning
Mert Yuksekgonul
Özgur Emre Sivrikaya
M. Baydogan
SSL
11
0
0
02 Oct 2019
Exploiting BERT for End-to-End Aspect-based Sentiment Analysis
Xin Li
Lidong Bing
Wenxuan Zhang
W. Lam
36
278
0
02 Oct 2019
State-of-the-Art Speech Recognition Using Multi-Stream Self-Attention With Dilated 1D Convolutions
Kyu Jeong Han
R. Prieto
Kaixing(Kai) Wu
T. Ma
18
69
0
01 Oct 2019
Quantile QT-Opt for Risk-Aware Vision-Based Robotic Grasping
Cristian Bodnar
A. Li
Karol Hausman
P. Pastor
Mrinal Kalakrishnan
OffRL
28
50
0
01 Oct 2019
Improved Word Sense Disambiguation Using Pre-Trained Contextualized Word Representations
Christian Hadiwinoto
Hwee Tou Ng
Wee Chung Gan
30
83
0
01 Oct 2019
The Non-IID Data Quagmire of Decentralized Machine Learning
Kevin Hsieh
Amar Phanishayee
O. Mutlu
Phillip B. Gibbons
18
558
0
01 Oct 2019
Contextual Graph Attention for Answering Logical Queries over Incomplete Knowledge Graphs
Gengchen Mai
K. Janowicz
Bo Yan
Rui Zhu
Ling Cai
Ni Lao
17
15
0
30 Sep 2019
Stabilizing Generative Adversarial Networks: A Survey
Maciej Wiatrak
Stefano V. Albrecht
A. Nystrom
GAN
29
84
0
30 Sep 2019
AdaptivFloat: A Floating-point based Data Type for Resilient Deep Learning Inference
Thierry Tambe
En-Yu Yang
Zishen Wan
Yuntian Deng
Vijay Janapa Reddi
Alexander M. Rush
David Brooks
Gu-Yeon Wei
MQ
19
21
0
29 Sep 2019
Improving RNN Transducer Modeling for End-to-End Speech Recognition
Jinyu Li
Rui Zhao
Hu Hu
Jiawei Liu
19
170
0
26 Sep 2019
Optimizing Speech Recognition For The Edge
Yuan Shangguan
Jian Li
Qiao Liang
R. Álvarez
Ian McGraw
28
64
0
26 Sep 2019
PairNorm: Tackling Oversmoothing in GNNs
Lingxiao Zhao
Leman Akoglu
16
505
0
26 Sep 2019
Scaling data-driven robotics with reward sketching and batch reinforcement learning
Serkan Cabi
Sergio Gomez Colmenarejo
Alexander Novikov
Ksenia Konyushkova
Scott E. Reed
...
David Barker
Jonathan Scholz
Misha Denil
Nando de Freitas
Ziyun Wang
OffRL
33
29
0
26 Sep 2019