Layer Normalization

21 July 2016

Jimmy Lei Ba

Papers citing "Layer Normalization"

50 / 5,528 papers shown

Title, authors, topic tags, counts, date
Adaptive Prediction Timing for Electronic Health Records J. Deasy A. Ercole Pietro Lio OOD 19 1 0 05 Mar 2020
q-VAE for Disentangled Representation Learning and Latent Dynamical Systems Taisuke Kobayashi BDL DRL 41 17 0 04 Mar 2020
Deep Learning in Memristive Nanowire Networks Jack D. Kendall Ross D. Pantone J. Nino 6 2 0 03 Mar 2020
Batch Normalization Provably Avoids Rank Collapse for Randomly Initialised Deep Networks Hadi Daneshmand Jonas Köhler Francis R. Bach Thomas Hofmann Aurelien Lucchi OOD ODL 10 4 0 03 Mar 2020
Meta-Embeddings Based On Self-Attention Qichen Li Xiaoke Jiang Jun Xia Jian Li 21 2 0 03 Mar 2020
Curriculum By Smoothing Samarth Sinha Animesh Garg Hugo Larochelle 21 7 0 03 Mar 2020
Benchmarking Graph Neural Networks Vijay Prakash Dwivedi Chaitanya K. Joshi Anh Tuan Luu T. Laurent Yoshua Bengio Xavier Bresson 194 927 0 02 Mar 2020
Transformer++ Prakhar Thapak P. Hore 14 0 0 02 Mar 2020
Style Example-Guided Text Generation using Generative Adversarial Transformers Kuo-Hao Zeng Mohammad Shoeybi Ming-Yuan Liu GAN 23 18 0 02 Mar 2020
Unblind Your Apps: Predicting Natural-Language Labels for Mobile GUI Components by Deep Learning Jieshan Chen Chunyang Chen Zhenchang Xing Xiwei Xu Liming Zhu Guoqiang Li Jinshui Wang 19 139 0 01 Mar 2020
Channel Equilibrium Networks for Learning Deep Representation Wenqi Shao Shitao Tang Xingang Pan Ping Tan Xiaogang Wang Ping Luo 30 17 0 29 Feb 2020
Augmented Cyclic Consistency Regularization for Unpaired Image-to-Image Translation Takehiko Ohkawa Naoto Inoue Hirokatsu Kataoka Nakamasa Inoue 37 6 0 29 Feb 2020
Two Routes to Scalable Credit Assignment without Weight Symmetry D. Kunin Aran Nayebi Javier Sagastuy-Breña Surya Ganguli Jonathan M. Bloom Daniel L. K. Yamins 38 32 0 28 Feb 2020
RP-DNN: A Tweet level propagation context based deep neural networks for early rumor detection in Social Media Jie Gao Sooji Han Xingyi Song F. Ciravegna 28 20 0 28 Feb 2020
Modeling Future Cost for Neural Machine Translation Chaoqun Duan Kehai Chen Rui Wang Masao Utiyama Eiichiro Sumita Conghui Zhu Tiejun Zhao AI4TS 30 15 0 28 Feb 2020
Advances in Collaborative Filtering and Ranking Liwei Wu 22 7 0 27 Feb 2020
Deep Residual-Dense Lattice Network for Speech Enhancement M. Nikzad Aaron Nicolson Yongsheng Gao Jun Zhou K. Paliwal Fanhua Shang 14 38 0 27 Feb 2020
Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers Zhuohan Li Eric Wallace Sheng Shen Kevin Lin Kurt Keutzer Dan Klein Joseph E. Gonzalez 32 149 0 26 Feb 2020
Refined Gate: A Simple and Effective Gating Mechanism for Recurrent Units Zhanzhan Cheng Yunlu Xu Mingjian Cheng Yu Qiao Shiliang Pu Yi Niu Fei Wu 16 8 0 26 Feb 2020
On Feature Normalization and Data Augmentation Boyi Li Felix Wu Ser-Nam Lim Serge J. Belongie Kilian Q. Weinberger 26 134 0 25 Feb 2020
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers Wenhui Wang Furu Wei Li Dong Hangbo Bao Nan Yang Ming Zhou VLM 54 1,224 0 25 Feb 2020
Layer-wise Conditioning Analysis in Exploring the Learning Dynamics of DNNs Lei Huang Jie Qin Li Liu Fan Zhu Ling Shao AI4CE 31 11 0 25 Feb 2020
Exploring BERT Parameter Efficiency on the Stanford Question Answering Dataset v2.0 Eric Hulburd 14 5 0 25 Feb 2020
Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training Weituo Hao Chunyuan Li Xiujun Li Lawrence Carin Jianfeng Gao LM&Ro 29 275 0 25 Feb 2020
Batch norm with entropic regularization turns deterministic autoencoders into generative models Amur Ghose Abdullah M. Rashwan Pascal Poupart UQCV 18 8 0 25 Feb 2020
Batch Normalization Biases Residual Blocks Towards the Identity Function in Deep Networks Soham De Samuel L. Smith ODL 32 20 0 24 Feb 2020
End-to-End Neural Diarization: Reformulating Speaker Diarization as Simple Multi-label Classification Yusuke Fujita Shinji Watanabe Shota Horiguchi Yawen Xue Kenji Nagamatsu 23 49 0 24 Feb 2020
GRET: Global Representation Enhanced Transformer Rongxiang Weng Hao-Ran Wei Shujian Huang Heng Yu Lidong Bing Weihua Luo Jiajun Chen 27 9 0 24 Feb 2020
On Hiding Neural Networks Inside Neural Networks Chuan Guo Ruihan Wu Kilian Q. Weinberger 12 5 0 24 Feb 2020
Interpretable Crowd Flow Prediction with Spatial-Temporal Self-Attention Haoxing Lin Weijia Jia Yongjian You Yiping Sun AI4TS 35 6 0 22 Feb 2020
Learning to Simulate Complex Physics with Graph Networks Alvaro Sanchez-Gonzalez Jonathan Godwin Tobias Pfaff Rex Ying J. Leskovec Peter W. Battaglia PINN AI4CE 70 1,062 0 21 Feb 2020
Addressing Some Limitations of Transformers with Feedback Memory Angela Fan Thibaut Lavril Edouard Grave Armand Joulin Sainbayar Sukhbaatar 33 11 0 21 Feb 2020
Transformer Hawkes Process Simiao Zuo Haoming Jiang Zichong Li T. Zhao H. Zha AI4TS 43 289 0 21 Feb 2020
AutoFoley: Artificial Synthesis of Synchronized Sound Tracks for Silent Videos with Deep Learning Sanchita Ghose John J. Prevost VGen 27 46 0 21 Feb 2020
Learning Dynamic Belief Graphs to Generalize on Text-Based Games Ashutosh Adhikari Xingdi Yuan Marc-Alexandre Côté M. Zelinka Marc-Antoine Rondeau Romain Laroche Pascal Poupart Jian Tang Adam Trischler William L. Hamilton AI4CE 40 81 0 21 Feb 2020
Adapted Center and Scale Prediction: More Stable and More Accurate Wenhao Wang 28 24 0 20 Feb 2020
Wavesplit: End-to-End Speech Separation by Speaker Clustering Neil Zeghidour David Grangier VLM 51 263 0 20 Feb 2020
A Novel Framework for Selection of GANs for an Application Tanya Motwani Manojkumar Somabhai Parmar 32 8 0 20 Feb 2020
Non-Autoregressive Dialog State Tracking Hung Le R. Socher Guosheng Lin 45 52 0 19 Feb 2020
A Survey of Deep Learning Techniques for Neural Machine Translation Shu Yang Yuxin Wang Xiaowen Chu VLM AI4TS AI4CE 38 138 0 18 Feb 2020
A New Clustering neural network for Chinese word segmentation Yuze Zhao 8 0 0 18 Feb 2020
Low-Rank Bottleneck in Multi-head Attention Models Srinadh Bhojanapalli Chulhee Yun A. S. Rawat Sashank J. Reddi Sanjiv Kumar 24 94 0 17 Feb 2020
Multi-layer Representation Fusion for Neural Machine Translation Qiang Wang Fuxue Li Tong Xiao Yanyang Li Yinqiao Li Jingbo Zhu AI4CE 33 52 0 16 Feb 2020
Neural Machine Translation with Joint Representation Yanyang Li Qiang Wang Tong Xiao Tongran Liu Jingbo Zhu 4 9 0 16 Feb 2020
Transformer on a Diet Chenguang Wang Zihao Ye Aston Zhang Zheng Zhang Alex Smola 32 8 0 14 Feb 2020
Towards an Appropriate Query, Key, and Value Computation for Knowledge Tracing Youngduck Choi Youngnam Lee Junghyun Cho Jineon Baek Byungsoo Kim Yeongmin Cha Dongmin Shin Chan Bae Jaewe Heo 14 198 0 14 Feb 2020
Cross-Iteration Batch Normalization Zhuliang Yao Yu Cao Shuxin Zheng Gao Huang Stephen Lin 19 85 0 13 Feb 2020
Deep Learning for Source Code Modeling and Generation: Models, Applications and Challenges T. H. Le Hao Chen Muhammad Ali Babar VLM 70 153 0 13 Feb 2020
Keyphrase Extraction with Span-based Feature Representations Funan Mu Zhenting Yu Lifeng Wang Yequan Wang Qingyu Yin Yibo Sun Liqun Liu Teng Ma Jing Tang Xing Zhou 53 17 0 13 Feb 2020
Regularizing activations in neural networks via distribution matching with the Wasserstein metric Taejong Joo Donggu Kang Byunghoon Kim 40 8 0 13 Feb 2020