1609.07843
Cited By
Pointer Sentinel Mixture Models
26 September 2016
Stephen Merity
Caiming Xiong
James Bradbury
R. Socher
RALM
Papers citing
"Pointer Sentinel Mixture Models"
50 / 706 papers shown
Title
Improving Neural Network Quantization without Retraining using Outlier Channel Splitting
Ritchie Zhao
Yuwei Hu
Jordan Dotzel
Christopher De Sa
Zhiru Zhang
OODD
MQ
50
305
0
28 Jan 2019
Global-to-local Memory Pointer Networks for Task-Oriented Dialogue
Chien-Sheng Wu
R. Socher
Caiming Xiong
24
165
0
15 Jan 2019
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context
Zihang Dai
Zhilin Yang
Yiming Yang
J. Carbonell
Quoc V. Le
Ruslan Salakhutdinov
VLM
38
3,679
0
09 Jan 2019
Choosing the Right Word: Using Bidirectional LSTM Tagger for Writing Support Systems
Victor Makarenkov
Lior Rokach
Bracha Shapira
24
35
0
08 Jan 2019
RNNs Implicitly Implement Tensor Product Representations
R. Thomas McCoy
Tal Linzen
Ewan Dunbar
P. Smolensky
49
54
0
20 Dec 2018
Learning Private Neural Language Modeling with Attentive Aggregation
Shaoxiong Ji
Shirui Pan
Guodong Long
Xue Li
Jing Jiang
Zi Huang
FedML
MoMe
16
136
0
17 Dec 2018
Can I trust you more? Model-Agnostic Hierarchical Explanations
Michael Tsang
Youbang Sun
Dongxu Ren
Yan Liu
FAtt
16
25
0
12 Dec 2018
Parameter Re-Initialization through Cyclical Batch Size Schedules
Norman Mu
Z. Yao
A. Gholami
Kurt Keutzer
Michael W. Mahoney
ODL
30
8
0
04 Dec 2018
Non-entailed subsequences as a challenge for natural language inference
R. Thomas McCoy
Tal Linzen
19
18
0
29 Nov 2018
Ordered Neurons: Integrating Tree Structures into Recurrent Neural Networks
Yikang Shen
Shawn Tan
Alessandro Sordoni
Aaron Courville
32
323
0
22 Oct 2018
Fast deep reinforcement learning using online adjustments from the past
Steven Hansen
Pablo Sprechmann
Alexander Pritzel
André Barreto
Charles Blundell
TTA
OffRL
OnRL
18
42
0
18 Oct 2018
Quasi-hyperbolic momentum and Adam for deep learning
Jerry Ma
Denis Yarats
ODL
84
129
0
16 Oct 2018
Trellis Networks for Sequence Modeling
Shaojie Bai
J. Zico Kolter
V. Koltun
25
145
0
15 Oct 2018
Persistence pays off: Paying Attention to What the LSTM Gating Mechanism Persists
Giancarlo D. Salton
John D. Kelleher
KELM
RALM
26
6
0
10 Oct 2018
Understanding Recurrent Neural Architectures by Analyzing and Synthesizing Long Distance Dependencies in Benchmark Sequential Datasets
Abhijit Mahalunkar
John D. Kelleher
24
8
0
06 Oct 2018
Learning Compressed Transforms with Low Displacement Rank
Anna T. Thomas
Albert Gu
Tri Dao
Atri Rudra
Christopher Ré
27
40
0
04 Oct 2018
Adaptive Input Representations for Neural Language Modeling
Alexei Baevski
Michael Auli
29
388
0
28 Sep 2018
Information-Weighted Neural Cache Language Models for ASR
Lyan Verwimp
J. Pelemans
Hugo Van hamme
P. Wambacq
KELM
RALM
16
2
0
24 Sep 2018
Direct Output Connection for a High-Rank Language Model
Sho Takase
Jun Suzuki
Masaaki Nagata
18
36
0
30 Aug 2018
A Neural Model of Adaptation in Reading
Marten van Schijndel
Tal Linzen
24
62
0
29 Aug 2018
Pyramidal Recurrent Unit for Language Modeling
Sachin Mehta
Rik Koncel-Kedziorski
Mohammad Rastegari
Hannaneh Hajishirzi
21
10
0
27 Aug 2018
Don't Use Large Mini-Batches, Use Local SGD
Tao R. Lin
Sebastian U. Stich
Kumar Kshitij Patel
Martin Jaggi
57
429
0
22 Aug 2018
Multi-Source Pointer Network for Product Title Summarization
Fei Sun
Peng Jiang
Hanxiao Sun
Changhua Pei
Wenwu Ou
Xiaobo Wang
31
47
0
21 Aug 2018
Persistent Hidden States and Nonlinear Transformation for Long Short-Term Memory
Heeyoul Choi
24
12
0
22 Jun 2018
Navigating with Graph Representations for Fast and Scalable Decoding of Neural Language Models
Minjia Zhang
Xiaodong Liu
Wenhan Wang
Jianfeng Gao
Yuxiong He
23
30
0
11 Jun 2018
Relational recurrent neural networks
Adam Santoro
Ryan Faulkner
David Raposo
Jack W. Rae
Mike Chrzanowski
T. Weber
Daan Wierstra
Oriol Vinyals
Razvan Pascanu
Timothy Lillicrap
GNN
30
209
0
05 Jun 2018
Sigsoftmax: Reanalysis of the Softmax Bottleneck
Sekitoshi Kanai
Yasuhiro Fujiwara
Yuki Yamanaka
S. Adachi
19
68
0
28 May 2018
Numeracy for Language Models: Evaluating and Improving their Ability to Predict Numbers
Georgios P. Spithourakis
Sebastian Riedel
30
81
0
21 May 2018
Zero-Shot Dialog Generation with Cross-Domain Latent Actions
Tiancheng Zhao
M. Eskénazi
VLM
27
76
0
13 May 2018
Sharp Nearby, Fuzzy Far Away: How Neural Language Models Use Context
Urvashi Khandelwal
He He
Peng Qi
Dan Jurafsky
RALM
16
294
0
12 May 2018
Noisin: Unbiased Regularization for Recurrent Neural Networks
Adji Bousso Dieng
Rajesh Ranganath
Jaan Altosaar
David M. Blei
22
22
0
03 May 2018
Split and Rephrase: Better Evaluation and a Stronger Baseline
Roee Aharoni
Yoav Goldberg
MoE
226
45
0
02 May 2018
Meta-Learning a Dynamical Language Model
Thomas Wolf
Julien Chaumond
Clement Delangue
32
4
0
28 Mar 2018
Fast Parametric Learning with Activation Memorization
Jack W. Rae
Chris Dyer
Peter Dayan
Timothy Lillicrap
KELM
41
46
0
27 Mar 2018
An Analysis of Neural Language Modeling at Multiple Scales
Stephen Merity
N. Keskar
R. Socher
24
170
0
22 Mar 2018
An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling
Shaojie Bai
J. Zico Kolter
V. Koltun
DRL
42
4,731
0
04 Mar 2018
Memory-based Parameter Adaptation
Pablo Sprechmann
Siddhant M. Jayakumar
Jack W. Rae
Alexander Pritzel
Adria Puigdomenech Badia
Benigno Uria
Oriol Vinyals
Demis Hassabis
Razvan Pascanu
Charles Blundell
ODL
OOD
VLM
16
101
0
28 Feb 2018
Context Models for OOV Word Translation in Low-Resource Languages
Angli Liu
Katrin Kirchhoff
29
9
0
26 Jan 2018
Fix your classifier: the marginal value of training the last weight layer
Elad Hoffer
Itay Hubara
Daniel Soudry
35
101
0
14 Jan 2018
Improving Generalization Performance by Switching from Adam to SGD
N. Keskar
R. Socher
ODL
41
521
0
20 Dec 2017
Code Completion with Neural Attention and Pointer Networks
Jian Li
Yue Wang
M. Lyu
Irwin King
37
236
0
27 Nov 2017
Evaluating prose style transfer with the Bible
Keith Carlson
A. Riddell
D. Rockmore
30
53
0
13 Nov 2017
Neural Language Modeling by Jointly Learning Syntax and Lexicon
Yikang Shen
Zhouhan Lin
Chin-Wei Huang
Aaron Courville
46
178
0
02 Nov 2017
Learning Differentially Private Recurrent Language Models
H. B. McMahan
Daniel Ramage
Kunal Talwar
Li Zhang
FedML
30
125
0
18 Oct 2017
Seq2SQL: Generating Structured Queries from Natural Language using Reinforcement Learning
Victor Zhong
Caiming Xiong
R. Socher
RALM
43
1,164
0
31 Aug 2017
Regularizing and Optimizing LSTM Language Models
Stephen Merity
N. Keskar
R. Socher
95
1,091
0
07 Aug 2017
Revisiting Activation Regularization for Language RNNs
Stephen Merity
Bryan McCann
R. Socher
33
44
0
03 Aug 2017
Challenges in Data-to-Document Generation
Sam Wiseman
Stuart M. Shieber
Alexander M. Rush
42
581
0
25 Jul 2017
Shakespearizing Modern Language Using Copy-Enriched Sequence-to-Sequence Models
Harsh Jhamtani
Varun Gangal
Eduard H. Hovy
Eric Nyberg
41
175
0
04 Jul 2017
Deriving Neural Architectures from Sequence and Graph Kernels
Tao Lei
Wengong Jin
Regina Barzilay
Tommi Jaakkola
GNN
45
137
0
25 May 2017