1708.02182
Cited By
Regularizing and Optimizing LSTM Language Models
7 August 2017
Stephen Merity
N. Keskar
R. Socher
Papers citing
"Regularizing and Optimizing LSTM Language Models"
50 / 509 papers shown
Title
Syntax-driven Iterative Expansion Language Models for Controllable Text Generation
Noe Casas
José A. R. Fonollosa
Marta R. Costa-jussá
19
11
0
05 Apr 2020
Improving Perceptual Quality of Drum Transcription with the Expanded Groove MIDI Dataset
Lee F. Callender
Curtis Hawthorne
Jesse Engel
43
20
0
01 Apr 2020
A Survey of Deep Learning for Scientific Discovery
M. Raghu
Erica Schmidt
OOD
AI4CE
40
120
0
26 Mar 2020
Multi-Class classification of vulnerabilities in Smart Contracts using AWD-LSTM, with pre-trained encoder inspired from natural language processing
Ajay K. Gogineni
S. Swayamjyoti
Devadatta Sahoo
K. Sahu
R. Kishore
31
32
0
21 Mar 2020
Anchor & Transform: Learning Sparse Embeddings for Large Vocabularies
Paul Pu Liang
Manzil Zaheer
Yuan Wang
Amr Ahmed
BDL
21
1
0
18 Mar 2020
Iterative Averaging in the Quest for Best Test Error
Diego Granziol
Xingchen Wan
Samuel Albanie
Stephen J. Roberts
10
3
0
02 Mar 2020
Tensor Networks for Probabilistic Sequence Modeling
Jacob Miller
Guillaume Rabusseau
John Terilla
16
5
0
02 Mar 2020
The Implicit and Explicit Regularization Effects of Dropout
Colin Wei
Sham Kakade
Tengyu Ma
30
114
0
28 Feb 2020
Temporal Convolutional Attention-based Network For Sequence Modeling
Hongyan Hao
Yan Wang
Siqiao Xue
Yudi Xia
Jian Zhao
S. Furao
30
41
0
28 Feb 2020
Backpropamine: training self-modifying neural networks with differentiable neuromodulated plasticity
Thomas Miconi
Aditya Rawal
Jeff Clune
Kenneth O. Stanley
13
90
0
24 Feb 2020
Addressing Some Limitations of Transformers with Feedback Memory
Angela Fan
Thibaut Lavril
Edouard Grave
Armand Joulin
Sainbayar Sukhbaatar
26
11
0
21 Feb 2020
MaxUp: A Simple Way to Improve Generalization of Neural Network Training
Chengyue Gong
Tongzheng Ren
Mao Ye
Qiang Liu
AAML
27
56
0
20 Feb 2020
A Systematic Comparison of Architectures for Document-Level Sentiment Classification
Jeremy Barnes
Vinit Ravishankar
Lilja Ovrelid
Erik Velldal
8
0
0
19 Feb 2020
SentenceMIM: A Latent Variable Language Model
M. Livne
Kevin Swersky
David J. Fleet
VLM
49
6
0
18 Feb 2020
Transformer on a Diet
Chenguang Wang
Zihao Ye
Aston Zhang
Zheng-Wei Zhang
Alex Smola
32
8
0
14 Feb 2020
Deep Learning for Source Code Modeling and Generation: Models, Applications and Challenges
T. H. Le
Hao Chen
Muhammad Ali Babar
VLM
64
152
0
13 Feb 2020
fastai: A Layered API for Deep Learning
Jeremy Howard
Sylvain Gugger
AI4CE
20
857
0
11 Feb 2020
Localized Flood Detection With Minimal Labeled Social Media Data Using Transfer Learning
Neha Singh
Nirmalya Roy
A. Gangopadhyay
27
6
0
10 Feb 2020
Understanding and Improving Knowledge Distillation
Jiaxi Tang
Rakesh Shivanna
Zhe Zhao
Dong Lin
Anima Singh
Ed H. Chi
Sagar Jain
27
129
0
10 Feb 2020
Blank Language Models
T. Shen
Victor Quach
Regina Barzilay
Tommi Jaakkola
203
73
0
08 Feb 2020
Consistency of a Recurrent Language Model With Respect to Incomplete Decoding
Sean Welleck
Ilia Kulikov
Jaedeok Kim
Richard Yuanzhe Pang
Kyunghyun Cho
17
65
0
06 Feb 2020
SQWA: Stochastic Quantized Weight Averaging for Improving the Generalization Capability of Low-Precision Deep Neural Networks
Sungho Shin
Yoonho Boo
Wonyong Sung
MQ
27
3
0
02 Feb 2020
ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training
Weizhen Qi
Yu Yan
Yeyun Gong
Dayiheng Liu
Nan Duan
Jiusheng Chen
Ruofei Zhang
Ming Zhou
AI4TS
27
446
0
13 Jan 2020
A Continuous Space Neural Language Model for Bengali Language
Hemayet Ahmed Chowdhury
Md. Azizul Haque Imon
Anisur Rahman
Aisha Khatun
Md. Saiful Islam
19
2
0
11 Jan 2020
CProp: Adaptive Learning Rate Scaling from Past Gradient Conformity
Konpat Preechakul
B. Kijsirikul
ODL
30
3
0
24 Dec 2019
Hierarchical Character Embeddings: Learning Phonological and Semantic Representations in Languages of Logographic Origin using Recursive Neural Networks
Minh Nguyen
G. Ngo
Nancy F. Chen
19
19
0
20 Dec 2019
Just Add Functions: A Neural-Symbolic Language Model
David Demeter
Doug Downey
8
11
0
11 Dec 2019
Why are Adaptive Methods Good for Attention Models?
J.N. Zhang
Sai Praneeth Karimireddy
Andreas Veit
Seungyeon Kim
Sashank J. Reddi
Surinder Kumar
S. Sra
18
79
0
06 Dec 2019
Fantastic Generalization Measures and Where to Find Them
Yiding Jiang
Behnam Neyshabur
H. Mobahi
Dilip Krishnan
Samy Bengio
AI4CE
14
596
0
04 Dec 2019
Domain-independent Dominance of Adaptive Methods
Pedro H. P. Savarese
David A. McAllester
Sudarshan Babu
Michael Maire
ODL
18
22
0
04 Dec 2019
A Comparative Study of Pretrained Language Models on Thai Social Text Categorization
Thanapapas Horsuwan
Kasidis Kanwatchara
P. Vateekul
B. Kijsirikul
14
9
0
03 Dec 2019
TX-Ray: Quantifying and Explaining Model-Knowledge Transfer in (Un-)Supervised NLP
Nils Rethmeier
V. Saxena
Isabelle Augenstein
FAtt
25
2
0
02 Dec 2019
How Can We Know What Language Models Know?
Zhengbao Jiang
Frank F. Xu
Jun Araki
Graham Neubig
KELM
41
1,373
0
28 Nov 2019
SimpleBooks: Long-term dependency book dataset with simplified English vocabulary for word-level language modeling
Huyen Nguyen
9
2
0
27 Nov 2019
DeFINE: DEep Factorized INput Token Embeddings for Neural Sequence Modeling
Sachin Mehta
Rik Koncel-Kedziorski
Mohammad Rastegari
Hannaneh Hajishirzi
AI4TS
38
23
0
27 Nov 2019
Autoencoding Undirected Molecular Graphs With Neural Networks
Jeppe Johan Waarkjaer Olsen
Peter Ebert Christensen
Martin Hangaard Hansen
Alexander Rosenberg Johansen
AI4CE
19
0
0
26 Nov 2019
Relevance-Promoting Language Model for Short-Text Conversation
Xin Li
Piji Li
Wei Bi
Xiaojiang Liu
Wai Lam
16
11
0
26 Nov 2019
Single Headed Attention RNN: Stop Thinking With Your Head
Stephen Merity
27
68
0
26 Nov 2019
AutoShrink: A Topology-aware NAS for Discovering Efficient Neural Architecture
Tunhou Zhang
Hsin-Pai Cheng
Zhenwen Li
Feng Yan
Chengyu Huang
H. Li
Yiran Chen
10
9
0
21 Nov 2019
Thick-Net: Parallel Network Structure for Sequential Modeling
Yu-Xuan Li
Jin-Yuan Liu
Liang Li
Xiang Guan
19
0
0
19 Nov 2019
RotationOut as a Regularization Method for Neural Network
Kaiqin Hu
Barnabás Póczós
33
1
0
18 Nov 2019
Multi-Zone Unit for Recurrent Neural Networks
Fandong Meng
Jinchao Zhang
Yang Liu
Jie Zhou
AI4CE
19
2
0
17 Nov 2019
A Subword Level Language Model for Bangla Language
Aisha Khatun
Anisur Rahman
Hemayet Ahmed Chowdhury
Md. Saiful Islam
A. Tasnim
17
4
0
15 Nov 2019
Exploiting Token and Path-based Representations of Code for Identifying Security-Relevant Commits
Achyudh Ram
Ji Xin
M. Nagappan
Yaoliang Yu
Rocío Cabrera Lozoya
A. Sabetta
Jimmy J. Lin
27
3
0
15 Nov 2019
KEPLER: A Unified Model for Knowledge Embedding and Pre-trained Language Representation
Xiaozhi Wang
Tianyu Gao
Zhaocheng Zhu
Zhengyan Zhang
Zhiyuan Liu
Juan-Zi Li
Jian Tang
15
647
0
13 Nov 2019
Optimizing Millions of Hyperparameters by Implicit Differentiation
Jonathan Lorraine
Paul Vicol
David Duvenaud
DD
30
403
0
06 Nov 2019
On the Effectiveness of the Pooling Methods for Biomedical Relation Extraction with Deep Learning
Tuan Ngo Nguyen
Franck Dernoncourt
Thien Huu Nguyen
19
5
0
04 Nov 2019
An Adaptive and Momental Bound Method for Stochastic Learning
Jianbang Ding
Xuancheng Ren
Ruixuan Luo
Xu Sun
ODL
19
46
0
27 Oct 2019
FineText: Text Classification via Attention-based Language Model Fine-tuning
Yunzhe Tao
Saurabh Gupta
Satyapriya Krishna
Xiong Zhou
Orchid Majumder
Vineet Khare
21
3
0
25 Oct 2019
Efficient Decoupled Neural Architecture Search by Structure and Operation Sampling
Heung-Chang Lee
Do-Guk Kim
Bohyung Han
38
6
0
23 Oct 2019