1708.02182
Cited By
Regularizing and Optimizing LSTM Language Models
7 August 2017
Stephen Merity
N. Keskar
R. Socher
Papers citing
"Regularizing and Optimizing LSTM Language Models"
50 / 509 papers shown
Title
"BNN - BN = ?": Training Binary Neural Networks without Batch Normalization
Tianlong Chen
Zhenyu Zhang
Xu Ouyang
Zechun Liu
Zhiqiang Shen
Zhangyang Wang
MQ
43
36
0
16 Apr 2021
Broccoli: Sprinkling Lightweight Vocabulary Learning into Everyday Information Diets
Roland Aydin
Lars Klein
Arnaud Miribel
Robert West
16
1
0
16 Apr 2021
RIANN -- A Robust Neural Network Outperforms Attitude Estimation Filters
Daniel Weber
C. Gühmann
Thomas Seel
20
35
0
15 Apr 2021
Lessons on Parameter Sharing across Layers in Transformers
Sho Takase
Shun Kiyono
25
84
0
13 Apr 2021
Evaluating Saliency Methods for Neural Language Models
Shuoyang Ding
Philipp Koehn
FAtt
XAI
23
54
0
12 Apr 2021
Revisiting Simple Neural Probabilistic Language Models
Simeng Sun
Mohit Iyyer
24
14
0
08 Apr 2021
Rethinking Perturbations in Encoder-Decoders for Fast Training
Sho Takase
Shun Kiyono
33
45
0
05 Apr 2021
Low-Resource Language Modelling of South African Languages
Stuart Mesham
Luc Hayward
Jared Shapiro
Jan Buys
4
14
0
01 Apr 2021
Data Augmentation in a Hybrid Approach for Aspect-Based Sentiment Analysis
Tomas Liesting
Flavius Frasincar
Maria Mihaela Truşcǎ
18
30
0
29 Mar 2021
Data Augmentation in Natural Language Processing: A Novel Text Generation Approach for Long and Short Text Classifiers
Markus Bayer
M. Kaufhold
Björn Buchhold
Marcel Keller
J. Dallmeyer
Christian A. Reuter
31
113
0
26 Mar 2021
ESCORT: Ethereum Smart COntRacTs Vulnerability Detection using Deep Neural Network and Transfer Learning
O. Lutz
Huili Chen
Hossein Fereidooni
Christoph Sendner
Alexandra Dmitrienko
A. Sadeghi
F. Koushanfar
15
46
0
23 Mar 2021
Token-wise Curriculum Learning for Neural Machine Translation
Chen Liang
Haoming Jiang
Xiaodong Liu
Pengcheng He
Weizhu Chen
Jianfeng Gao
T. Zhao
21
4
0
20 Mar 2021
Improving Authorship Verification using Linguistic Divergence
Yifan Zhang
Dainis Boumber
Marjan Hosseinia
Fan Yang
Arjun Mukherjee
12
1
0
12 Mar 2021
Nondeterminism and Instability in Neural Network Optimization
Cecilia Summers
M. Dinneen
27
38
0
08 Mar 2021
Random Feature Attention
Hao Peng
Nikolaos Pappas
Dani Yogatama
Roy Schwartz
Noah A. Smith
Lingpeng Kong
36
348
0
03 Mar 2021
indicnlp@kgp at DravidianLangTech-EACL2021: Offensive Language Identification in Dravidian Languages
Kushal Kedia
Abhilash Nandy
24
23
0
14 Feb 2021
High-Performance Large-Scale Image Recognition Without Normalization
Andrew Brock
Soham De
Samuel L. Smith
Karen Simonyan
VLM
223
512
0
11 Feb 2021
Train your classifier first: Cascade Neural Networks Training from upper layers to lower layers
Shucong Zhang
Cong-Thanh Do
R. Doddipatla
Erfan Loweimi
P. Bell
Steve Renals
24
2
0
09 Feb 2021
Eliminating Sharp Minima from SGD with Truncated Heavy-tailed Noise
Xingyu Wang
Sewoong Oh
C. Rhee
13
13
0
08 Feb 2021
A Comprehensive Survey on Hardware-Aware Neural Architecture Search
Hadjer Benmeziane
Kaoutar El Maghraoui
Hamza Ouarnoughi
Smail Niar
Martin Wistuba
Naigang Wang
34
96
0
22 Jan 2021
Detecting Hostile Posts using Relational Graph Convolutional Network
Sarthak
Shikhar Shukla
K. V. Arya
GNN
11
2
0
10 Jan 2021
AutoDropout: Learning Dropout Patterns to Regularize Deep Networks
Hieu H. Pham
Quoc V. Le
76
56
0
05 Jan 2021
Leveraging Audio Gestalt to Predict Media Memorability
Lorin Sweeney
Graham Healy
Alan F. Smeaton
29
6
0
31 Dec 2020
Contextual Temperature for Language Modeling
Pei-Hsin Wang
Sheng-Iou Hsieh
Shih-Chieh Chang
Yu-Ting Chen
Jia-Yu Pan
Wei Wei
Da-Chang Juan
45
25
0
25 Dec 2020
Optimizing Deep Neural Networks through Neuroevolution with Stochastic Gradient Descent
Haichao Zhang
K. Hao
Lei Gao
Bing Wei
Xue-song Tang
19
12
0
21 Dec 2020
Recent advances in deep learning theory
Fengxiang He
Dacheng Tao
AI4CE
24
50
0
20 Dec 2020
Data-Efficient Methods for Dialogue Systems
Igor Shalyminov
14
0
0
05 Dec 2020
End to End ASR System with Automatic Punctuation Insertion
Yushi Guan
3DV
27
5
0
03 Dec 2020
Mutual Information Constraints for Monte-Carlo Objectives
Gábor Melis
András Gyorgy
Phil Blunsom
21
1
0
01 Dec 2020
Regularizing Recurrent Neural Networks via Sequence Mixup
Armin Karamzade
Amir Najafi
S. Motahari
16
0
0
27 Nov 2020
Learning Associative Inference Using Fast Weight Memory
Imanol Schlag
Tsendsuren Munkhdalai
Jürgen Schmidhuber
KELM
30
44
0
16 Nov 2020
DORB: Dynamically Optimizing Multiple Rewards with Bandits
Ramakanth Pasunuru
Han Guo
Joey Tianyi Zhou
OffRL
32
6
0
15 Nov 2020
Exploring the Value of Personalized Word Embeddings
Charles F Welch
Jonathan K. Kummerfeld
Verónica Pérez-Rosas
Rada Mihalcea
17
15
0
11 Nov 2020
Scaling Hidden Markov Language Models
Justin T. Chiu
Alexander M. Rush
BDL
22
25
0
09 Nov 2020
Fusion Models for Improved Visual Captioning
M. Kalimuthu
Aditya Mogadala
Marius Mosbach
Dietrich Klakow
VLM
26
0
0
28 Oct 2020
Delta-STN: Efficient Bilevel Optimization for Neural Networks using Structured Response Jacobians
Juhan Bae
Roger C. Grosse
27
24
0
26 Oct 2020
Revisiting Neural Language Modelling with Syllables
Arturo Oncevay
Kervy Rivas Rojas
18
2
0
24 Oct 2020
Large Scale Legal Text Classification Using Transformer Models
Zein Shaheen
G. Wohlgenannt
Erwin Filtz
AILaw
32
67
0
24 Oct 2020
On Convergence and Generalization of Dropout Training
Poorya Mianjy
R. Arora
37
30
0
23 Oct 2020
Exploiting News Article Structure for Automatic Corpus Generation of Entailment Datasets
Jan Christian Blaise Cruz
Jose Kristian Resabal
James Lin
Dan John Velasco
C. Cheng
6
11
0
22 Oct 2020
Cascaded Models With Cyclic Feedback For Direct Speech Translation
Tsz Kin Lam
Shigehiko Schamoni
Stefan Riezler
32
12
0
21 Oct 2020
Adaptive Gradient Method with Resilience and Momentum
Jie Liu
Chen Lin
Chuming Li
Lu Sheng
Ming Sun
Junjie Yan
Wanli Ouyang
ODL
14
0
0
21 Oct 2020
Complaint Identification in Social Media with Transformer Networks
Mali Jin
Nikolaos Aletras
12
16
0
21 Oct 2020
Where's the Question? A Multi-channel Deep Convolutional Neural Network for Question Identification in Textual Data
George Michalopoulos
Helen H. Chen
Alexander Wong
MedIm
17
1
0
15 Oct 2020
Pagsusuri ng RNN-based Transfer Learning Technique sa Low-Resource Language
Dan John Velasco
9
3
0
13 Oct 2020
Towards Theoretically Understanding Why SGD Generalizes Better Than ADAM in Deep Learning
Pan Zhou
Jiashi Feng
Chao Ma
Caiming Xiong
Guosheng Lin
E. Weinan
25
228
0
12 Oct 2020
Compositional Demographic Word Embeddings
Charles F Welch
Jonathan K. Kummerfeld
Verónica Pérez-Rosas
Rada Mihalcea
21
31
0
06 Oct 2020
On the Branching Bias of Syntax Extracted from Pre-trained Language Models
Huayang Li
Lemao Liu
Guoping Huang
Shuming Shi
23
6
0
06 Oct 2020
Gauravarora@HASOC-Dravidian-CodeMix-FIRE2020: Pre-training ULMFiT on Synthetically Generated Code-Mixed Data for Hate Speech Detection
Gaurav Arora
14
27
0
05 Oct 2020
Improved Analysis of Clipping Algorithms for Non-convex Optimization
Bohang Zhang
Jikai Jin
Cong Fang
Liwei Wang
38
87
0
05 Oct 2020