1607.06450
Cited By
Layer Normalization
21 July 2016
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton
Papers citing "Layer Normalization"
50 / 5,529 papers shown
Title
Adaptive Transformers in RL
Shakti Kumar
Jerrod Parker
Panteha Naderian
OffRL
AI4CE
14
13
0
08 Apr 2020
Re-translation versus Streaming for Simultaneous Translation
N. Arivazhagan
Colin Cherry
Wolfgang Macherey
George F. Foster
39
63
0
07 Apr 2020
Neural Analogical Matching
Mayank Agarwal
Constantine Nakos
Ibrahim Abdelaziz
Kenneth D. Forbus
NAI
HAI
21
14
0
07 Apr 2020
Efficient Context and Schema Fusion Networks for Multi-Domain Dialogue State Tracking
Su Zhu
Jieyu Li
Lu Chen
Kai Yu
46
57
0
07 Apr 2020
How Do You Act? An Empirical Study to Understand Behavior of Deep Reinforcement Learning Agents
Richard Meyes
Moritz Schneider
Tobias Meisen
31
2
0
07 Apr 2020
RYANSQL: Recursively Applying Sketch-based Slot Fillings for Complex Text-to-SQL in Cross-Domain Databases
Donghyun Choi
M. Shin
EungGyun Kim
Dong Ryeol Shin
38
123
0
07 Apr 2020
MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices
Zhiqing Sun
Hongkun Yu
Xiaodan Song
Renjie Liu
Yiming Yang
Denny Zhou
MQ
59
801
0
06 Apr 2020
Evolving Normalization-Activation Layers
Hanxiao Liu
Andrew Brock
Karen Simonyan
Quoc V. Le
55
79
0
06 Apr 2020
Rethinking Spatially-Adaptive Normalization
Zhentao Tan
Dongdong Chen
Qi Chu
Menglei Chai
Jing Liao
Mingming He
Lu Yuan
Nenghai Yu
39
12
0
06 Apr 2020
At Which Level Should We Extract? An Empirical Analysis on Extractive Document Summarization
Qingyu Zhou
Furu Wei
Ming Zhou
32
21
0
06 Apr 2020
Residual Shuffle-Exchange Networks for Fast Processing of Long Sequences
Andis Draguns
Emīls Ozoliņš
A. Sostaks
Matiss Apinis
Kārlis Freivalds
19
8
0
06 Apr 2020
Bootstrapping a Crosslingual Semantic Parser
Tom Sherborne
Yumo Xu
Mirella Lapata
46
25
0
06 Apr 2020
TraDE: Transformers for Density Estimation
Rasool Fakoor
Pratik Chaudhari
Jonas W. Mueller
Alex Smola
47
30
0
06 Apr 2020
CG-BERT: Conditional Text Generation with BERT for Generalized Few-shot Intent Detection
Congying Xia
Chenwei Zhang
Hoang Nguyen
Jiawei Zhang
Philip Yu
14
43
0
04 Apr 2020
Pre-training for Abstractive Document Summarization by Reinstating Source Text
Yanyan Zou
Xingxing Zhang
Wei Lu
Furu Wei
Ming Zhou
36
1
0
04 Apr 2020
Gradient Centralization: A New Optimization Technique for Deep Neural Networks
Hongwei Yong
Jianqiang Huang
Xiansheng Hua
Lei Zhang
ODL
32
184
0
03 Apr 2020
Pixel-BERT: Aligning Image Pixels with Text by Deep Multi-Modal Transformers
Zhicheng Huang
Zhaoyang Zeng
Bei Liu
Dongmei Fu
Jianlong Fu
ViT
69
437
0
02 Apr 2020
Improved RawNet with Feature Map Scaling for Text-independent Speaker Verification using Raw Waveforms
Jee-weon Jung
Seung-bin Kim
Hye-jin Shim
Ju-ho Kim
Ha-Jin Yu
26
60
0
01 Apr 2020
Sample Efficient Ensemble Learning with Catalyst.RL
Sergey Kolesnikov
Valentin Khrulkov
20
4
0
29 Mar 2020
GPS-Net: Graph Property Sensing Network for Scene Graph Generation
Xin Lin
Changxing Ding
Jinquan Zeng
Dacheng Tao
70
278
0
29 Mar 2020
Serialized Output Training for End-to-End Overlapped Speech Recognition
Naoyuki Kanda
Yashesh Gaur
Xiaofei Wang
Zhong Meng
Takuya Yoshioka
32
113
0
28 Mar 2020
An Investigation into the Stochasticity of Batch Whitening
Lei Huang
Lei Zhao
Yi Zhou
Fan Zhu
Li Liu
Ling Shao
50
18
0
27 Mar 2020
Improved Techniques for Training Single-Image GANs
Tobias Hinz
Matthew Fisher
Oliver Wang
S. Wermter
GAN
VLM
20
144
0
25 Mar 2020
Deep Reinforcement Learning with Robust and Smooth Policy
Qianli Shen
Yuante Li
Haoming Jiang
Zhaoran Wang
T. Zhao
OOD
36
5
0
21 Mar 2020
Cross-Shape Attention for Part Segmentation of 3D Point Clouds
Marios Loizou
Siddhant Garg
Dmitry Petrov
Melinos Averkiou
E. Kalogerakis
3DPC
32
2
0
20 Mar 2020
Normalized and Geometry-Aware Self-Attention Network for Image Captioning
Longteng Guo
Jing Liu
Xinxin Zhu
Peng Yao
Shichen Lu
Hanqing Lu
ViT
137
189
0
19 Mar 2020
Exemplar Normalization for Learning Deep Representation
Ruimao Zhang
Zhanglin Peng
Lingyun Wu
Zhuguo Li
Ping Luo
OOD
57
13
0
19 Mar 2020
Lighthouse: Predicting Lighting Volumes for Spatially-Coherent Illumination
Pratul P. Srinivasan
B. Mildenhall
Matthew Tancik
Jonathan T. Barron
Richard Tucker
Noah Snavely
3DV
38
94
0
18 Mar 2020
Scene Text Recognition via Transformer
Xinjie Feng
Huanjin Yao
Yuankai Qi
Jun Zhang
Shengping Zhang
ViT
33
9
0
18 Mar 2020
Boosting Unconstrained Face Recognition with Auxiliary Unlabeled Data
Yichun Shi
Anil K. Jain
CVBM
61
1
0
17 Mar 2020
PowerNorm: Rethinking Batch Normalization in Transformers
Sheng Shen
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
BDL
24
16
0
17 Mar 2020
Multi-modal Dense Video Captioning
Vladimir E. Iashin
Esa Rahtu
27
165
0
17 Mar 2020
Geometric Approaches to Increase the Expressivity of Deep Neural Networks for MR Reconstruction
Eunju Cha
Gyutaek Oh
J. C. Ye
37
11
0
17 Mar 2020
TRANS-BLSTM: Transformer with Bidirectional LSTM for Language Understanding
Zhiheng Huang
Peng Xu
Davis Liang
Ajay K. Mishra
Bing Xiang
15
31
0
16 Mar 2020
GMM-UNIT: Unsupervised Multi-Domain and Multi-Modal Image-to-Image Translation via Attribute Gaussian Mixture Modeling
Yahui Liu
Marco De Nadai
Jian Yao
N. Sebe
Bruno Lepri
Xavier Alameda-Pineda
29
25
0
15 Mar 2020
Invariant Causal Prediction for Block MDPs
Amy Zhang
Clare Lyle
Shagun Sodhani
Angelos Filos
Marta Z. Kwiatkowska
Joelle Pineau
Y. Gal
Doina Precup
OffRL
AI4CE
OOD
43
139
0
12 Mar 2020
Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy
M. Saffar
Ashish Vaswani
David Grangier
MoE
266
585
0
12 Mar 2020
Extended Batch Normalization
Chunjie Luo
Jianfeng Zhan
Lei Wang
Wanling Gao
42
14
0
12 Mar 2020
How Powerful Are Randomly Initialized Pointcloud Set Functions?
Aditya Sanghi
P. Jayaraman
3DPC
25
3
0
11 Mar 2020
ReZero is All You Need: Fast Convergence at Large Depth
Thomas C. Bachlechner
Bodhisattwa Prasad Majumder
H. H. Mao
G. Cottrell
Julian McAuley
AI4CE
35
276
0
10 Mar 2020
Learning to Respond with Stickers: A Framework of Unifying Multi-Modality in Multi-Turn Dialog
Shen Gao
Preslav Nakov
Chang Liu
Li Liu
Dongyan Zhao
Rui Yan
34
32
0
10 Mar 2020
Hybrid Attention-Based Transformer Block Model for Distant Supervision Relation Extraction
Yan Xiao
Yaochu Jin
Ran Cheng
K. Hao
12
31
0
10 Mar 2020
Communication-Efficient Distributed Deep Learning: A Comprehensive Survey
Zhenheng Tang
Shaoshuai Shi
Wei Wang
Yue Liu
Xiaowen Chu
31
48
0
10 Mar 2020
Cross-Modal Food Retrieval: Learning a Joint Embedding of Food Images and Recipes with Semantic Consistency and Attention Mechanism
Hao Wang
Doyen Sahoo
Chenghao Liu
Ke Shu
Palakorn Achananuparp
Ee-Peng Lim
Guosheng Lin
70
46
0
09 Mar 2020
ProGen: Language Modeling for Protein Generation
Ali Madani
Bryan McCann
Nikhil Naik
N. Keskar
N. Anand
Raphael R. Eguchi
Po-Ssu Huang
R. Socher
34
276
0
08 Mar 2020
Synaptic Metaplasticity in Binarized Neural Networks
Axel Laborieux
M. Ernoult
T. Hirtzlin
D. Querlioz
CLL
31
62
0
07 Mar 2020
TTPP: Temporal Transformer with Progressive Prediction for Efficient Action Anticipation
Wen Wang
Xiaojiang Peng
Yanzhou Su
Yu Qiao
Jian Cheng
AI4TS
25
18
0
07 Mar 2020
TaskNorm: Rethinking Batch Normalization for Meta-Learning
J. Bronskill
Jonathan Gordon
James Requeima
Sebastian Nowozin
Richard Turner
73
89
0
06 Mar 2020
Teaching Temporal Logics to Neural Networks
Christopher Hahn
Frederik Schmitt
Jens U. Kreber
M. Rabe
Bernd Finkbeiner
NAI
40
66
0
06 Mar 2020
Diverse and Admissible Trajectory Forecasting through Multimodal Context Understanding
Seonguk Park
Gyubok Lee
Manoj Bhat
Jimin Seo
Minseok Kang
Jonathan M Francis
Ashwin R. Jadhav
Paul Pu Liang
Louis-Philippe Morency
138
119
0
06 Mar 2020