1602.02410
Exploring the Limits of Language Modeling
7 February 2016
Rafal Jozefowicz
Oriol Vinyals
M. Schuster
Noam M. Shazeer
Yonghui Wu
Papers citing
"Exploring the Limits of Language Modeling"
50 / 167 papers shown
Accelerating Large Scale Knowledge Distillation via Dynamic Importance Sampling
Minghan Li
Tanli Zuo
Ruicheng Li
Martha White
Weishi Zheng
29
3
0
03 Dec 2018
End-to-End Retrieval in Continuous Space
D. Gillick
Alessandro Presta
Gaurav Singh Tomar
14
101
0
19 Nov 2018
Unsupervised Transfer Learning for Spoken Language Understanding in Intelligent Agents
Aditya Siddhant
Anuj Kumar Goyal
A. Metallinou
9
50
0
13 Nov 2018
Federated Learning for Mobile Keyboard Prediction
Andrew Straiton Hard
Kanishka Rao
Zhifeng Lin
Swaroop Indra Ramaswamy
Youjie Li
S. Augenstein
A. Schwing
M. Annavaram
A. Avestimehr
FedML
53
1,511
0
08 Nov 2018
Content preserving text generation with attribute controls
Lajanugen Logeswaran
Honglak Lee
Samy Bengio
38
117
0
03 Nov 2018
Real-time Neural-based Input Method
Jiali Yao
Raphael Shu
Xinjian Li
K. Ohtsuki
Hideki Nakayama
6
4
0
19 Oct 2018
A Comprehensive Survey of Deep Learning for Image Captioning
Md Zakir Hossain
Ferdous Sohel
M. Shiratuddin
Hamid Laga
VLM
3DV
45
761
0
06 Oct 2018
Adaptive Input Representations for Neural Language Modeling
Alexei Baevski
Michael Auli
26
387
0
28 Sep 2018
Multi-task Learning with Sample Re-weighting for Machine Reading Comprehension
Yichong Xu
Xiaodong Liu
Yelong Shen
Jingjing Liu
Jianfeng Gao
23
51
0
18 Sep 2018
Self-Supervised Generation of Spatial Audio for 360 Video
Pedro Morgado
Nuno Vasconcelos
Timothy R. Langlois
Oliver Wang
MDE
24
171
0
07 Sep 2018
Noise Contrastive Estimation and Negative Sampling for Conditional Models: Consistency and Statistical Efficiency
Zhuang Ma
Michael Collins
14
142
0
06 Sep 2018
Dissecting Contextual Word Embeddings: Architecture and Representation
Matthew E. Peters
Mark Neumann
Luke Zettlemoyer
Wen-tau Yih
35
425
0
27 Aug 2018
Multi-Perspective Context Aggregation for Semi-supervised Cloze-style Reading Comprehension
Liang Wang
Sujian Li
Wei-Ye Zhao
Kewei Shen
Meng Sun
Ruoyu Jia
Jingming Liu
25
7
0
20 Aug 2018
Parallax: Sparsity-aware Data Parallel Training of Deep Neural Networks
Soojeong Kim
Gyeong-In Yu
Hojin Park
Sungwoo Cho
Eunji Jeong
Hyeonmin Ha
Sanha Lee
Joo Seong Jeong
Byung-Gon Chun
23
73
0
08 Aug 2018
Neural Arithmetic Logic Units
Andrew Trask
Felix Hill
Scott E. Reed
Jack W. Rae
Chris Dyer
Phil Blunsom
NAI
24
203
0
01 Aug 2018
Supporting Very Large Models using Automatic Dataflow Graph Partitioning
Minjie Wang
Chien-chin Huang
Jinyang Li
46
154
0
24 Jul 2018
Guess who? Multilingual approach for the automated generation of author-stylized poetry
Alexey Tikhonov
Ivan P. Yamshchikov
27
34
0
17 Jul 2018
Unsupervised and Efficient Vocabulary Expansion for Recurrent Neural Network Language Models in ASR
Yerbolat Khassanov
Chng Eng Siong
KELM
22
5
0
27 Jun 2018
PCA of high dimensional random walks with comparison to neural network training
J. Antognini
Jascha Narain Sohl-Dickstein
OOD
24
27
0
22 Jun 2018
Navigating with Graph Representations for Fast and Scalable Decoding of Neural Language Models
Minjia Zhang
Xiaodong Liu
Wenhan Wang
Jianfeng Gao
Yuxiong He
23
30
0
11 Jun 2018
Relational recurrent neural networks
Adam Santoro
Ryan Faulkner
David Raposo
Jack W. Rae
Mike Chrzanowski
T. Weber
Daan Wierstra
Oriol Vinyals
Razvan Pascanu
Timothy Lillicrap
GNN
30
209
0
05 Jun 2018
Analysis of DAWNBench, a Time-to-Accuracy Machine Learning Performance Benchmark
Cody Coleman
Daniel Kang
Deepak Narayanan
Luigi Nardi
Tian Zhao
Jian Zhang
Peter Bailis
K. Olukotun
Christopher Ré
Matei A. Zaharia
13
117
0
04 Jun 2018
Training LSTM Networks with Resistive Cross-Point Devices
Tayfun Gokmen
Malte J. Rasch
W. Haensch
8
45
0
01 Jun 2018
Hierarchical Neural Story Generation
Angela Fan
M. Lewis
Yann N. Dauphin
DiffM
60
1,586
0
13 May 2018
Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context
Urvashi Khandelwal
He He
Peng Qi
Dan Jurafsky
RALM
16
293
0
12 May 2018
Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling
Liyuan Liu
Xiang Ren
Jingbo Shang
Jian-wei Peng
Jiawei Han
25
44
0
20 Apr 2018
EmoRL: Continuous Acoustic Emotion Classification using Deep Reinforcement Learning
Egor Lakomkin
M. Zamani
C. Weber
S. Magg
S. Wermter
27
22
0
03 Apr 2018
Network Traffic Anomaly Detection Using Recurrent Neural Networks
Benjamin J. Radford
Leonardo M. Apolonio
Antonio J. Trias
Jim A. Simpson
11
105
0
28 Mar 2018
Fast Parametric Learning with Activation Memorization
Jack W. Rae
Chris Dyer
Peter Dayan
Timothy Lillicrap
KELM
41
46
0
27 Mar 2018
Deep neural decoders for near term fault-tolerant experiments
C. Chamberland
Pooya Ronagh
21
82
0
18 Feb 2018
Generative Models for Stochastic Processes Using Convolutional Neural Networks
Fernando Fernandes Neto
GAN
24
0
0
09 Jan 2018
SGAN: An Alternative Training of Generative Adversarial Networks
Tatjana Chavdarova
F. Fleuret
GAN
38
57
0
06 Dec 2017
Deep Learning Scaling is Predictable, Empirically
Joel Hestness
Sharan Narang
Newsha Ardalani
G. Diamos
Heewoo Jun
Hassan Kianinejad
Md. Mostofa Ali Patwary
Yang Yang
Yanqi Zhou
54
711
0
01 Dec 2017
MinimalRNN: Toward More Interpretable and Trainable Recurrent Neural Networks
Minmin Chen
30
28
0
18 Nov 2017
Lattice Rescoring Strategies for Long Short Term Memory Language Models in Speech Recognition
Shankar Kumar
M. Nirschl
D. Holtmann-Rice
H. Liao
A. Suresh
Felix X. Yu
KELM
33
40
0
15 Nov 2017
Mixed Precision Training
Paulius Micikevicius
Sharan Narang
Jonah Alben
G. Diamos
Erich Elsen
...
Boris Ginsburg
Michael Houston
Oleksii Kuchaiev
Ganesh Venkatesh
Hao Wu
90
1,764
0
10 Oct 2017
Language Modeling with Highway LSTM
Gakuto Kurata
Bhuvana Ramabhadran
G. Saon
A. Sethy
AI4TS
21
38
0
19 Sep 2017
Self-organized Hierarchical Softmax
Songlin Yang
Shawn Tan
C. Pal
Aaron Courville
BDL
38
7
0
26 Jul 2017
Device Placement Optimization with Reinforcement Learning
Azalia Mirhoseini
Hieu H. Pham
Quoc V. Le
Benoit Steiner
Rasmus Larsen
Yuefeng Zhou
Naveen Kumar
Mohammad Norouzi
Samy Bengio
J. Dean
27
436
0
13 Jun 2017
Spectral Norm Regularization for Improving the Generalizability of Deep Learning
Yuichi Yoshida
Takeru Miyato
35
325
0
31 May 2017
Semi-supervised sequence tagging with bidirectional language models
Matthew E. Peters
Bridger Waleed Ammar
Chandra Bhagavatula
Russell Power
19
634
0
29 Apr 2017
Affect-LM: A Neural Language Model for Customizable Affective Text Generation
Sayan Ghosh
Mathieu Chollet
Eugene Laksana
Louis-Philippe Morency
Stefan Scherer
KELM
CVBM
24
190
0
22 Apr 2017
What do Neural Machine Translation Models Learn about Morphology?
Yonatan Belinkov
Nadir Durrani
Fahim Dalvi
Hassan Sajjad
James R. Glass
61
410
0
11 Apr 2017
Learning to Generate Reviews and Discovering Sentiment
Alec Radford
Rafal Jozefowicz
Ilya Sutskever
44
504
0
05 Apr 2017
Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer
Noam M. Shazeer
Azalia Mirhoseini
Krzysztof Maziarz
Andy Davis
Quoc V. Le
Geoffrey E. Hinton
J. Dean
MoE
55
2,524
0
23 Jan 2017
Generating Focussed Molecule Libraries for Drug Discovery with Recurrent Neural Networks
Marwin H. S. Segler
T. Kogej
C. Tyrchan
M. Waller
30
96
0
05 Jan 2017
Language Modeling with Gated Convolutional Networks
Yann N. Dauphin
Angela Fan
Michael Auli
David Grangier
50
2,360
0
23 Dec 2016
Highway and Residual Networks learn Unrolled Iterative Estimation
Klaus Greff
R. Srivastava
Jürgen Schmidhuber
AI4TS
26
214
0
22 Dec 2016
Capacity and Trainability in Recurrent Neural Networks
Jasmine Collins
Jascha Narain Sohl-Dickstein
David Sussillo
35
203
0
29 Nov 2016
Variable Computation in Recurrent Neural Networks
Yacine Jernite
Edouard Grave
Armand Joulin
Tomáš Mikolov
32
59
0
18 Nov 2016