Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1609.07843
Cited By
Pointer Sentinel Mixture Models
26 September 2016
Stephen Merity
Caiming Xiong
James Bradbury
R. Socher
RALM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Pointer Sentinel Mixture Models"
50 / 696 papers shown
Title
Adversarial Black-Box Attacks On Text Classifiers Using Multi-Objective Genetic Optimization Guided By Deep Networks
Alex Mathai
Shreya Khare
Srikanth G. Tamilselvam
Senthil Mani
AAML
36
6
0
08 Nov 2020
Concealed Data Poisoning Attacks on NLP Models
Eric Wallace
Tony Zhao
Shi Feng
Sameer Singh
SILM
19
18
0
23 Oct 2020
Limitations of Autoregressive Models and Their Alternatives
Chu-cheng Lin
Aaron Jaech
Xin Li
Matthew R. Gormley
Jason Eisner
29
58
0
22 Oct 2020
Cross Copy Network for Dialogue Generation
Changzhen Ji
Xiaoxia Zhou
Yating Zhang
Xiaozhong Liu
Changlong Sun
Conghui Zhu
T. Zhao
27
11
0
22 Oct 2020
Dual Averaging is Surprisingly Effective for Deep Learning Optimization
Samy Jelassi
Aaron Defazio
36
4
0
20 Oct 2020
Memformer: A Memory-Augmented Transformer for Sequence Modeling
Qingyang Wu
Zhenzhong Lan
Kun Qian
Jing Gu
A. Geramifard
Zhou Yu
22
49
0
14 Oct 2020
Are Some Words Worth More than Others?
Shiran Dudy
Steven Bedrick
18
14
0
12 Oct 2020
Plan ahead: Self-Supervised Text Planning for Paragraph Completion Task
Dongyeop Kang
Eduard H. Hovy
LRM
42
24
0
11 Oct 2020
Knowledge-Enriched Distributional Model Inversion Attacks
Si-An Chen
Mostafa Kahla
R. Jia
Guo-Jun Qi
24
93
0
08 Oct 2020
Learning to Recombine and Resample Data for Compositional Generalization
Ekin Akyürek
Afra Feyza Akyürek
Jacob Andreas
29
79
0
08 Oct 2020
A Mathematical Exploration of Why Language Models Help Solve Downstream Tasks
Nikunj Saunshi
Sadhika Malladi
Sanjeev Arora
22
87
0
07 Oct 2020
HeteroFL: Computation and Communication Efficient Federated Learning for Heterogeneous Clients
Enmao Diao
Jie Ding
Vahid Tarokh
FedML
46
546
0
03 Oct 2020
Data-Efficient Pretraining via Contrastive Self-Supervision
Nils Rethmeier
Isabelle Augenstein
28
20
0
02 Oct 2020
Improving Low Compute Language Modeling with In-Domain Embedding Initialisation
Charles F Welch
Rada Mihalcea
Jonathan K. Kummerfeld
AI4CE
19
4
0
29 Sep 2020
Multi-timescale Representation Learning in LSTM Language Models
Shivangi Mahto
Vy A. Vo
Javier S. Turek
Alexander G. Huth
15
29
0
27 Sep 2020
Automated Source Code Generation and Auto-completion Using Deep Learning: Comparing and Discussing Current Language-Model-Related Approaches
Juan Cruz-Benito
Sanjay Vishwakarma
Francisco Martín-Fernández
Ismael Faro Ibm Quantum
22
30
0
16 Sep 2020
Efficient Transformers: A Survey
Yi Tay
Mostafa Dehghani
Dara Bahri
Donald Metzler
VLM
114
1,103
0
14 Sep 2020
Cluster-Former: Clustering-based Sparse Transformer for Long-Range Dependency Encoding
Shuohang Wang
Luowei Zhou
Zhe Gan
Yen-Chun Chen
Yuwei Fang
S. Sun
Yu Cheng
Jingjing Liu
43
28
0
13 Sep 2020
Probabilistic Predictions of People Perusing: Evaluating Metrics of Language Model Performance for Psycholinguistic Modeling
Sophie Hao
S. Mendelsohn
Rachel Sterneck
Randi Martinez
Robert Frank
19
46
0
08 Sep 2020
Adversarial Watermarking Transformer: Towards Tracing Text Provenance with Data Hiding
Sahar Abdelnabi
Mario Fritz
WaLM
28
44
0
07 Sep 2020
Stochastic Normalized Gradient Descent with Momentum for Large-Batch Training
Shen-Yi Zhao
Chang-Wei Shi
Yin-Peng Xie
Wu-Jun Li
ODL
26
8
0
28 Jul 2020
FTRANS: Energy-Efficient Acceleration of Transformers using FPGA
Bingbing Li
Santosh Pandey
Haowen Fang
Yanjun Lyv
Ji Li
Jieyang Chen
Mimi Xie
Lipeng Wan
Hang Liu
Caiwen Ding
AI4CE
16
170
0
16 Jul 2020
Hopfield Networks is All You Need
Hubert Ramsauer
Bernhard Schafl
Johannes Lehner
Philipp Seidl
Michael Widrich
...
David P. Kreil
Michael K Kopp
Günter Klambauer
Johannes Brandstetter
Sepp Hochreiter
24
415
0
16 Jul 2020
A Survey of Privacy Attacks in Machine Learning
M. Rigaki
Sebastian Garcia
PILM
AAML
39
213
0
15 Jul 2020
Term Revealing: Furthering Quantization at Run Time on Quantized DNNs
H. T. Kung
Bradley McDanel
S. Zhang
MQ
21
9
0
13 Jul 2020
Climbing the WOL: Training for Cheaper Inference
Zichang Liu
Zhaozhuo Xu
A. Ji
Jonathan Li
Beidi Chen
Anshumali Shrivastava
TPM
24
7
0
02 Jul 2020
Direct Feedback Alignment Scales to Modern Deep Learning Tasks and Architectures
Julien Launay
Iacopo Poli
Franccois Boniface
Florent Krzakala
41
63
0
23 Jun 2020
Extension of Direct Feedback Alignment to Convolutional and Recurrent Neural Network for Bio-plausible Deep Learning
Donghyeon Han
Gwangtae Park
Junha Ryu
H. Yoo
3DV
15
5
0
23 Jun 2020
The Depth-to-Width Interplay in Self-Attention
Yoav Levine
Noam Wies
Or Sharir
Hofit Bata
Amnon Shashua
30
45
0
22 Jun 2020
Learning to Prove from Synthetic Theorems
Eser Aygun
Zafarali Ahmed
Ankit Anand
Vlad Firoiu
Xavier Glorot
Laurent Orseau
Doina Precup
Shibl Mourad
NAI
20
20
0
19 Jun 2020
Categorical Normalizing Flows via Continuous Transformations
Phillip Lippe
E. Gavves
BDL
21
43
0
17 Jun 2020
NAS-Bench-NLP: Neural Architecture Search Benchmark for Natural Language Processing
Nikita Klyuchnikov
I. Trofimov
Ekaterina Artemova
Mikhail Salnikov
M. Fedorov
Evgeny Burnaev
VLM
21
101
0
12 Jun 2020
On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines
Marius Mosbach
Maksym Andriushchenko
Dietrich Klakow
31
352
0
08 Jun 2020
Copy that! Editing Sequences by Copying Spans
Sheena Panthaplackel
Miltiadis Allamanis
Marc Brockschmidt
BDL
21
28
0
08 Jun 2020
Linformer: Self-Attention with Linear Complexity
Sinong Wang
Belinda Z. Li
Madian Khabsa
Han Fang
Hao Ma
66
1,651
0
08 Jun 2020
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
AAML
64
2,626
0
05 Jun 2020
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
Z. Yao
A. Gholami
Sheng Shen
Mustafa Mustafa
Kurt Keutzer
Michael W. Mahoney
ODL
39
275
0
01 Jun 2020
Contextualizing ASR Lattice Rescoring with Hybrid Pointer Network Language Model
Da-Rong Liu
Chunxi Liu
Frank Zhang
Gabriel Synnaeve
Yatharth Saraf
Geoffrey Zweig
28
19
0
15 May 2020
A Tale of Two Perplexities: Sensitivity of Neural Language Models to Lexical Retrieval Deficits in Dementia of the Alzheimer's Type
T. Cohen
Serguei V. S. Pakhomov
22
25
0
07 May 2020
DQI: Measuring Data Quality in NLP
Swaroop Mishra
Anjana Arunkumar
Bhavdeep Singh Sachdeva
Chris Bryan
Chitta Baral
36
30
0
02 May 2020
BERT-kNN: Adding a kNN Search Component to Pretrained Language Models for Better QA
Nora Kassner
Hinrich Schütze
RALM
21
68
0
02 May 2020
Dynamic Sampling and Selective Masking for Communication-Efficient Federated Learning
Shaoxiong Ji
Wenqi Jiang
A. Walid
Xue Li
FedML
28
66
0
21 Mar 2020
Learning to Encode Position for Transformer with Continuous Dynamical Model
Xuanqing Liu
Hsiang-Fu Yu
Inderjit Dhillon
Cho-Jui Hsieh
16
107
0
13 Mar 2020
ReZero is All You Need: Fast Convergence at Large Depth
Thomas C. Bachlechner
Bodhisattwa Prasad Majumder
H. H. Mao
G. Cottrell
Julian McAuley
AI4CE
30
276
0
10 Mar 2020
Temporal Convolutional Attention-based Network For Sequence Modeling
Hongyan Hao
Yan Wang
Siqiao Xue
Yudi Xia
Jian Zhao
S. Furao
30
41
0
28 Feb 2020
Statistical Adaptive Stochastic Gradient Methods
Pengchuan Zhang
Hunter Lang
Qiang Liu
Lin Xiao
ODL
15
11
0
25 Feb 2020
Limits of Detecting Text Generated by Large-Scale Language Models
L. Varshney
N. Keskar
R. Socher
DeLMO
21
18
0
09 Feb 2020
On the distance between two neural networks and the stability of learning
Jeremy Bernstein
Arash Vahdat
Yisong Yue
Xuan Li
ODL
200
57
0
09 Feb 2020
Consistency of a Recurrent Language Model With Respect to Incomplete Decoding
Sean Welleck
Ilia Kulikov
Jaedeok Kim
Richard Yuanzhe Pang
Kyunghyun Cho
17
65
0
06 Feb 2020
Single Headed Attention RNN: Stop Thinking With Your Head
Stephen Merity
27
68
0
26 Nov 2019
Previous
1
2
3
...
11
12
13
14
Next