ResearchTrend.AI
Layer Normalization

21 July 2016
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton

Papers citing "Layer Normalization"

50 / 5,515 papers shown
Amortized Bethe Free Energy Minimization for Learning MRFs
Sam Wiseman
Yoon Kim
TPM
DRL
21
11
0
14 Jun 2019
Lattice Transformer for Speech Translation
Pei Zhang
Boxing Chen
Niyu Ge
Kai Fan
39
48
0
13 Jun 2019
COMET: Commonsense Transformers for Automatic Knowledge Graph Construction
Antoine Bosselut
Hannah Rashkin
Maarten Sap
Chaitanya Malaviya
Asli Celikyilmaz
Yejin Choi
20
903
0
12 Jun 2019
Keeping Notes: Conditional Natural Language Generation with a Scratchpad Mechanism
Ryan Y. Benmalek
Madian Khabsa
Suma Desu
Claire Cardie
Michele Banko
30
6
0
12 Jun 2019
Monotonic Infinite Lookback Attention for Simultaneous Machine Translation
N. Arivazhagan
Colin Cherry
Wolfgang Macherey
Chung-Cheng Chiu
Semih Yavuz
Ruoming Pang
Wei Li
Colin Raffel
CLL
19
190
0
12 Jun 2019
Learning the Graphical Structure of Electronic Health Records with Graph Convolutional Transformer
Edward Choi
Zhen Xu
Yujia Li
Michael W. Dusenberry
Gerardo Flores
Yuan Xue
Andrew M. Dai
MedIm
24
238
0
11 Jun 2019
Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval
Yale Song
M. Soleymani
19
242
0
11 Jun 2019
A Document-grounded Matching Network for Response Selection in Retrieval-based Chatbots
Xueliang Zhao
Chongyang Tao
Wei Wu
Can Xu
Dongyan Zhao
Rui Yan
23
41
0
11 Jun 2019
Improving Neural Language Modeling via Adversarial Training
Dilin Wang
Chengyue Gong
Qiang Liu
AAML
50
115
0
10 Jun 2019
Matching the Blanks: Distributional Similarity for Relation Learning
Livio Baldini Soares
Nicholas FitzGerald
Jeffrey Ling
Tom Kwiatkowski
9
765
0
07 Jun 2019
Learning Adaptive Classifiers Synthesis for Generalized Few-Shot Learning
Han-Jia Ye
Hexiang Hu
De-Chuan Zhan
30
59
0
07 Jun 2019
The Normalization Method for Alleviating Pathological Sharpness in Wide Neural Networks
Ryo Karakida
S. Akaho
S. Amari
27
40
0
07 Jun 2019
Understanding and Improving Transformer From a Multi-Particle Dynamic System Point of View
Yiping Lu
Zhuohan Li
Di He
Zhiqing Sun
Bin Dong
Tao Qin
Liwei Wang
Tie-Yan Liu
AI4CE
24
168
0
06 Jun 2019
Towards Lossless Encoding of Sentences
Gabriele Prato
Mathieu Duchesneau
A. Chandar
Alain Tapp
20
2
0
04 Jun 2019
Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations
Vincent Sitzmann
Michael Zollhoefer
Gordon Wetzstein
3DPC
3DV
104
1,269
0
04 Jun 2019
Self-Attentional Models for Lattice Inputs
Matthias Sperber
Graham Neubig
Ngoc-Quan Pham
A. Waibel
23
42
0
04 Jun 2019
Lattice-Based Transformer Encoder for Neural Machine Translation
Fengshun Xiao
Jiangtong Li
Zhao Hai
Rui Wang
Kehai Chen
34
42
0
04 Jun 2019
NodeDrop: A Condition for Reducing Network Size without Effect on Output
Louis Jensen
Jacob A. Harer
S. Chin
13
0
0
03 Jun 2019
Fashion Editing with Adversarial Parsing Learning
Haoye Dong
Xiaodan Liang
Yixuan Zhang
Xujie Zhang
Zhenyu Xie
Bowen Wu
Ziqi Zhang
Xiaohui Shen
Jian Yin
GAN
28
11
0
03 Jun 2019
Efficient 8-Bit Quantization of Transformer Neural Machine Language Translation Model
Aishwarya Bhandare
Vamsi Sripathi
Deepthi Karkada
Vivek V. Menon
Sun Choi
Kushal Datta
V. Saletore
MQ
30
131
0
03 Jun 2019
Multimodal Transformer for Unaligned Multimodal Language Sequences
Yao-Hung Hubert Tsai
Shaojie Bai
Paul Pu Liang
J. Zico Kolter
Louis-Philippe Morency
Ruslan Salakhutdinov
32
1,260
0
01 Jun 2019
Attentional Policies for Cross-Context Multi-Agent Reinforcement Learning
Matthew A. Wright
R. Horowitz
17
3
0
31 May 2019
A Lightweight Recurrent Network for Sequence Modeling
Biao Zhang
Rico Sennrich
27
7
0
30 May 2019
Hierarchical Transformers for Multi-Document Summarization
Yang Liu
Mirella Lapata
33
294
0
30 May 2019
DDP-GCN: Multi-Graph Convolutional Network for Spatiotemporal Traffic Forecasting
Kyungeun Lee
Wonjong Rhee
AI4TS
GNN
28
109
0
29 May 2019
Revisiting Low-Resource Neural Machine Translation: A Case Study
Rico Sennrich
Biao Zhang
21
223
0
28 May 2019
Learning Efficient and Effective Exploration Policies with Counterfactual Meta Policy
Ruihan Yang
Qiwei Ye
Tie-Yan Liu
30
0
0
28 May 2019
CIF: Continuous Integrate-and-Fire for End-to-End Speech Recognition
Linhao Dong
Bo Xu
27
125
0
27 May 2019
Learning to Route in Similarity Graphs
Dmitry Baranchuk
Dmitry Persiyanov
A. Sinitsin
Artem Babenko
21
36
0
27 May 2019
AI-GAs: AI-generating algorithms, an alternate paradigm for producing general artificial intelligence
Jeff Clune
21
118
0
27 May 2019
QuesNet: A Unified Representation for Heterogeneous Test Questions
Yu Yin
Qi Liu
Zhenya Huang
Enhong Chen
Wei Tong
Shijin Wang
Yu-Ho Su
11
45
0
27 May 2019
Collaborative Self-Attention for Recommender Systems
Kai-Lang Yao
Wu-Jun Li
30
1
0
27 May 2019
Gated Group Self-Attention for Answer Selection
Dong Xu
Jianhui Ji
Haikuan Huang
Hongbo Deng
Wu-Jun Li
19
3
0
26 May 2019
From Here to There: Video Inbetweening Using Direct 3D Convolutions
Yunpeng Li
Dominik Roblek
Marco Tagliasacchi
VGen
22
24
0
24 May 2019
mu-Forcing: Training Variational Recurrent Autoencoders for Text Generation
Dayiheng Liu
Xu Yang
Feng He
Yuanyuan Chen
Jiancheng Lv
DRL
BDL
15
33
0
24 May 2019
CDSA: Cross-Dimensional Self-Attention for Multivariate, Geo-tagged Time Series Imputation
Jiawei Ma
Zheng Shou
Alireza Zareian
Hassan Mansour
A. Vetro
Shih-Fu Chang
AI4TS
31
61
0
23 May 2019
Countering Noisy Labels By Learning From Auxiliary Clean Labels
Tsung Wei Tsai
Chongxuan Li
Jun Zhu
SSL
10
1
0
23 May 2019
Data-Efficient Image Recognition with Contrastive Predictive Coding
Olivier J. Hénaff
A. Srinivas
J. Fauw
Ali Razavi
Carl Doersch
S. M. Ali Eslami
Aaron van den Oord
SSL
60
1,417
0
22 May 2019
A Seq-to-Seq Transformer Premised Temporal Convolutional Network for Chinese Word Segmentation
Wei Jiang
Yan Tang
15
4
0
21 May 2019
Generating Logical Forms from Graph Representations of Text and Entities
Peter Shaw
Philip Massey
Angelica Chen
Francesco Piccinno
Yasemin Altun
GNN
AI4CE
NAI
35
38
0
21 May 2019
Multimodal Transformer with Multi-View Visual Representation for Image Captioning
Jun-chen Yu
Jing Li
Zhou Yu
Qingming Huang
ViT
27
377
0
20 May 2019
Dueling Decoders: Regularizing Variational Autoencoder Latent Spaces
Bryan Seybold
Emily Fertig
Alexander A. Alemi
Ian S. Fischer
DRL
29
4
0
17 May 2019
Recurrent Kalman Networks: Factorized Inference in High-Dimensional Deep Feature Spaces
P. Becker
Harit Pandya
Gregor H. W. Gebhardt
Cheng Zhao
James Taylor
Gerhard Neumann
BDL
24
94
0
17 May 2019
Using Photorealistic Face Synthesis and Domain Adaptation to Improve Facial Expression Analysis
Behzad Bozorgtabar
Mohammad Saeed Rad
H. K. Ekenel
Jean-Philippe Thiran
CVBM
28
19
0
17 May 2019
Exact-K Recommendation via Maximal Clique Optimization
Yu Gong
Yu Zhu
Lu Duan
Qingwen Liu
Ziyu Guan
Fei Sun
Wenwu Ou
Kenny Q. Zhu
OffRL
CML
37
59
0
17 May 2019
Learning from Context: Exploiting and Interpreting File Path Information for Better Malware Detection
Adarsh Kyadige
Ethan M. Rudd
Konstantin Berlin
15
7
0
16 May 2019
HIBERT: Document Level Pre-training of Hierarchical Bidirectional Transformers for Document Summarization
Xingxing Zhang
Furu Wei
M. Zhou
37
377
0
16 May 2019
Incorporating Sememes into Chinese Definition Modeling
Liner Yang
Cunliang Kong
Yun Chen
Yang Liu
Qinan Fan
Erhong Yang
11
30
0
16 May 2019
Meta reinforcement learning as task inference
Jan Humplik
Alexandre Galashov
Leonard Hasenclever
Pedro A. Ortega
Yee Whye Teh
N. Heess
OffRL
41
127
0
15 May 2019
Aligning Visual Regions and Textual Concepts for Semantic-Grounded Image Representations
Fenglin Liu
Yuanxin Liu
Xuancheng Ren
Xiaodong He
Xu Sun
VLM
34
81
0
15 May 2019