ResearchTrend.AI
Layer Normalization

21 July 2016
Jimmy Lei Ba
J. Kiros
Geoffrey E. Hinton

Papers citing "Layer Normalization"

50 / 5,502 papers shown
Scaling Neural Machine Translation
Myle Ott
Sergey Edunov
David Grangier
Michael Auli
AIMat
47
610
0
01 Jun 2018
The Nonlinearity Coefficient - Predicting Generalization in Deep Neural Networks
George Philipp
J. Carbonell
23
14
0
01 Jun 2018
Understanding Batch Normalization
Johan Bjorck
Carla P. Gomes
B. Selman
Kilian Q. Weinberger
21
593
0
01 Jun 2018
How Does Batch Normalization Help Optimization?
Shibani Santurkar
Dimitris Tsipras
Andrew Ilyas
A. Madry
ODL
32
1,522
0
29 May 2018
Bi-Directional Neural Machine Translation with Synthetic Parallel Data
Xing Niu
Michael J. Denkowski
Marine Carpuat
SyDa
16
58
0
29 May 2018
Exponential convergence rates for Batch Normalization: The power of length-direction decoupling in non-convex optimization
Jonas Köhler
Hadi Daneshmand
Aurelien Lucchi
M. Zhou
K. Neymeyr
Thomas Hofmann
20
91
0
27 May 2018
Stereo Magnification: Learning View Synthesis using Multiplane Images
Tinghui Zhou
Richard Tucker
John Flynn
Graham Fyffe
Noah Snavely
3DV
11
380
0
24 May 2018
Learning towards Minimum Hyperspherical Energy
Weiyang Liu
Rongmei Lin
Ziqiang Liu
Lixin Liu
Zhiding Yu
Bo Dai
Le Song
30
145
0
23 May 2018
Generalisation of structural knowledge in the hippocampal-entorhinal system
James C. R. Whittington
Timothy H. Muller
Shirley Mark
Caswell Barry
Timothy Edward John Behrens
21
50
0
23 May 2018
AffinityNet: semi-supervised few-shot learning for disease type prediction
Tianle Ma
A. Zhang
21
55
0
22 May 2018
Batch-Instance Normalization for Adaptively Style-Invariant Neural Networks
Hyeonseob Nam
Hyo-Eun Kim
OOD
24
208
0
21 May 2018
Evolution-Guided Policy Gradient in Reinforcement Learning
Shauharda Khadka
Kagan Tumer
19
223
0
21 May 2018
Multi-view Sentence Representation Learning
Shuai Tang
V. D. Sa
SSL
19
3
0
18 May 2018
Batch Normalization in the final layer of generative networks
Sean Mullery
P. Whelan
VLM
GAN
17
5
0
18 May 2018
Towards Robust Neural Machine Translation
Yong Cheng
Zhaopeng Tu
Fandong Meng
Junjie Zhai
Yang Liu
AAML
19
161
0
16 May 2018
Hierarchical Neural Story Generation
Angela Fan
M. Lewis
Yann N. Dauphin
DiffM
60
1,586
0
13 May 2018
Deep Neural Machine Translation with Weakly-Recurrent Units
Mattia Antonino Di Gangi
Marcello Federico
AIMat
25
19
0
10 May 2018
A comparable study of modeling units for end-to-end Mandarin speech recognition
Wei Zou
Dongwei Jiang
Shuaijiang Zhao
Xiangang Li
24
32
0
10 May 2018
Decoding Decoders: Finding Optimal Representation Spaces for Unsupervised Similarity Tasks
V. Zhelezniak
Dan Busbridge
April Shen
Samuel L. Smith
Nils Y. Hammerla
SSL
14
4
0
09 May 2018
Unsupervised learning for concept detection in medical images: a comparative analysis
Eduardo Pinho
C. Costa
SSL
25
11
0
04 May 2018
Upping the Ante: Towards a Better Benchmark for Chinese-to-English Machine Translation
Christian Hadiwinoto
Hwee Tou Ng
ELM
6
4
0
04 May 2018
Noisin: Unbiased Regularization for Recurrent Neural Networks
Adji Bousso Dieng
Rajesh Ranganath
Jaan Altosaar
David M. Blei
22
22
0
03 May 2018
What you can cram into a single vector: Probing sentence embeddings for linguistic properties
Alexis Conneau
Germán Kruszewski
Guillaume Lample
Loïc Barrault
Marco Baroni
201
882
0
03 May 2018
Constituency Parsing with a Self-Attentive Encoder
Nikita Kitaev
Dan Klein
30
535
0
02 May 2018
Semi-parametric Image Synthesis
Xiaojuan Qi
Qifeng Chen
Jiaya Jia
V. Koltun
19
169
0
29 Apr 2018
The Best of Both Worlds: Combining Recent Advances in Neural Machine Translation
Mengzhao Chen
Orhan Firat
Ankur Bapna
Melvin Johnson
Wolfgang Macherey
...
Niki Parmar
M. Schuster
Zhifeng Chen
Yonghui Wu
Macduff Hughes
AIMat
21
457
0
26 Apr 2018
Personalized Language Model for Query Auto-Completion
Aaron Jaech
Mari Ostendorf
RALM
11
65
0
25 Apr 2018
Decorrelated Batch Normalization
Lei Huang
Dawei Yang
B. Lang
Jia Deng
16
190
0
23 Apr 2018
QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension
Adams Wei Yu
David Dohan
Minh-Thang Luong
Rui Zhao
Kai Chen
Mohammad Norouzi
Quoc V. Le
RALM
AIMat
35
1,092
0
23 Apr 2018
Linguistically-Informed Self-Attention for Semantic Role Labeling
Emma Strubell
Pat Verga
D. Andor
David J. Weiss
Andrew McCallum
OffRL
18
379
0
23 Apr 2018
Multi-task Learning for Universal Sentence Embeddings: A Thorough Evaluation using Transfer and Auxiliary Tasks
Wasi Uddin Ahmad
Xueying Bai
Zhechao Huang
Chao Jiang
Nanyun Peng
Kai-Wei Chang
SSL
33
6
0
21 Apr 2018
Factorising AMR generation through syntax
Kris Cao
S. Clark
LRM
12
20
0
20 Apr 2018
Revisiting Small Batch Training for Deep Neural Networks
Dominic Masters
Carlo Luschi
ODL
37
661
0
20 Apr 2018
Fast Weight Long Short-Term Memory
Thomas Anderson Keller
S. N. Sridhar
Xin Wang
23
1
0
18 Apr 2018
A Deep Learning Approach to Fast, Format-Agnostic Detection of Malicious Web Content
Joshua Saxe
Richard E. Harang
Cody Wild
Hillary Sanders
14
32
0
13 Apr 2018
Large scale distributed neural network training through online distillation
Rohan Anil
Gabriel Pereyra
Alexandre Passos
Róbert Ormándi
George E. Dahl
Geoffrey E. Hinton
FedML
278
404
0
09 Apr 2018
On the Robustness of Speech Emotion Recognition for Human-Robot Interaction with Deep Neural Networks
Egor Lakomkin
M. Zamani
C. Weber
S. Magg
S. Wermter
14
52
0
06 Apr 2018
Finding beans in burgers: Deep semantic-visual embedding with localization
Martin Engilberge
Louis Chevallier
P. Pérez
Matthieu Cord
6
95
0
05 Apr 2018
Real-Time Prediction of the Duration of Distribution System Outages
Aaron Jaech
Baosen Zhang
Mari Ostendorf
D. Kirschen
16
74
0
03 Apr 2018
End-to-End Dense Video Captioning with Masked Transformer
Luowei Zhou
Yingbo Zhou
Jason J. Corso
R. Socher
Caiming Xiong
25
524
0
03 Apr 2018
Universal Planning Networks
A. Srinivas
Allan Jabri
Pieter Abbeel
Sergey Levine
Chelsea Finn
SSL
25
145
0
02 Apr 2018
Learning to Run challenge solutions: Adapting reinforcement learning methods for neuromusculoskeletal environments
L. Kidzinski
Sharada Mohanty
Carmichael F. Ong
Zhewei Huang
Shuchang Zhou
...
Sean F. Carroll
Jennifer Hicks
Sergey Levine
M. Salathé
Scott L. Delp
34
87
0
02 Apr 2018
Marian: Fast Neural Machine Translation in C++
Marcin Junczys-Dowmunt
Roman Grundkiewicz
Tomasz Dwojak
Hieu T. Hoang
Kenneth Heafield
...
Ulrich Germann
Alham Fikri Aji
Nikolay Bogoychev
André F. T. Martins
Alexandra Birch
12
710
0
01 Apr 2018
Substitute Teacher Networks: Learning with Almost No Supervision
Samuel Albanie
James Thewlis
Joao F. Henriques
14
2
0
01 Apr 2018
Training Tips for the Transformer Model
Martin Popel
Ondrej Bojar
12
306
0
01 Apr 2018
Generative Modeling using the Sliced Wasserstein Distance
Ishani Deshpande
Ziyu Zhang
A. Schwing
GAN
20
221
0
29 Mar 2018
Normalization of Neural Networks using Analytic Variance Propagation
Alexander Shekhovtsov
B. Flach
19
6
0
28 Mar 2018
Fast Parametric Learning with Activation Memorization
Jack W. Rae
Chris Dyer
Peter Dayan
Timothy Lillicrap
KELM
41
46
0
27 Mar 2018
Learning the Multiple Traveling Salesmen Problem with Permutation Invariant Pooling Networks
Yoav Kaempfer
Lior Wolf
27
71
0
26 Mar 2018
One-Shot Segmentation in Clutter
Claudio Michaelis
Matthias Bethge
Alexander S. Ecker
28
40
0
26 Mar 2018