ResearchTrend.AI
Pointer Sentinel Mixture Models

26 September 2016
Stephen Merity
Caiming Xiong
James Bradbury
R. Socher
    RALM

Papers citing "Pointer Sentinel Mixture Models"

50 / 706 papers shown
Temporal Convolutional Attention-based Network For Sequence Modeling
Hongyan Hao
Yan Wang
Siqiao Xue
Yudi Xia
Jian Zhao
S. Furao
30
41
0
28 Feb 2020
Statistical Adaptive Stochastic Gradient Methods
Pengchuan Zhang
Hunter Lang
Qiang Liu
Lin Xiao
ODL
15
11
0
25 Feb 2020
Limits of Detecting Text Generated by Large-Scale Language Models
Lav Varshney
N. Keskar
R. Socher
DeLMO
21
18
0
09 Feb 2020
On the distance between two neural networks and the stability of learning
Jeremy Bernstein
Arash Vahdat
Yisong Yue
Xuan Li
ODL
200
57
0
09 Feb 2020
Consistency of a Recurrent Language Model With Respect to Incomplete Decoding
Sean Welleck
Ilia Kulikov
Jaedeok Kim
Richard Yuanzhe Pang
Kyunghyun Cho
17
65
0
06 Feb 2020
Low-Complexity LSTM Training and Inference with FloatSD8 Weight Representation
Yu-Tung Liu
T. Chiueh
MQ
23
1
0
23 Jan 2020
Single Headed Attention RNN: Stop Thinking With Your Head
Stephen Merity
27
68
0
26 Nov 2019
Compressive Transformers for Long-Range Sequence Modelling
Jack W. Rae
Anna Potapenko
Siddhant M. Jayakumar
Timothy Lillicrap
RALM
VLM
KELM
13
623
0
13 Nov 2019
Improving Transformer Models by Reordering their Sublayers
Ofir Press
Noah A. Smith
Omer Levy
22
87
0
10 Nov 2019
Generalization through Memorization: Nearest Neighbor Language Models
Urvashi Khandelwal
Omer Levy
Dan Jurafsky
Luke Zettlemoyer
M. Lewis
RALM
71
817
0
01 Nov 2019
On Generalization Bounds of a Family of Recurrent Neural Networks
Minshuo Chen
Xingguo Li
T. Zhao
19
70
0
28 Oct 2019
FineText: Text Classification via Attention-based Language Model Fine-tuning
Yunzhe Tao
Saurabh Gupta
Satyapriya Krishna
Xiong Zhou
Orchid Majumder
Vineet Khare
21
3
0
25 Oct 2019
Localization of Fake News Detection via Multitask Transfer Learning
Jan Christian Blaise Cruz
Julianne Agatha Tan
C. Cheng
25
33
0
21 Oct 2019
Improving Sequence Modeling Ability of Recurrent Neural Networks via Sememes
Yujia Qin
Fanchao Qi
Sicong Ouyang
Zhiyuan Liu
Cheng Yang
Yasheng Wang
Qun Liu
Maosong Sun
28
5
0
20 Oct 2019
Using a KG-Copy Network for Non-Goal Oriented Dialogues
Debanjan Chaudhuri
Md. Rony
Simon Jordan
Jens Lehmann
27
12
0
17 Oct 2019
Searching for A Robust Neural Architecture in Four GPU Hours
Xuanyi Dong
Yezhou Yang
20
647
0
10 Oct 2019
Kernel-Based Approaches for Sequence Modeling: Connections to Neural Methods
Kevin J Liang
Guoyin Wang
Yitong Li
Ricardo Henao
Lawrence Carin
30
2
0
09 Oct 2019
Data-Efficient Goal-Oriented Conversation with Dialogue Knowledge Transfer Networks
Yuchang Sun
Sungjin Lee
Yaliang Li
Jun Zhang
24
11
0
03 Oct 2019
DyKgChat: Benchmarking Dialogue Generation Grounding on Dynamic Knowledge Graphs
Yi-Lin Tuan
Yun-Nung Chen
Hung-yi Lee
21
71
0
01 Oct 2019
A Constructive Prediction of the Generalization Error Across Scales
Jonathan S. Rosenfeld
Amir Rosenfeld
Yonatan Belinkov
Nir Shavit
36
207
0
27 Sep 2019
Reducing Transformer Depth on Demand with Structured Dropout
Angela Fan
Edouard Grave
Armand Joulin
43
584
0
25 Sep 2019
Alleviating Sequence Information Loss with Data Overlapping and Prime Batch Sizes
Noémien Kocher
Christian Scuito
Lorenzo Tarantino
Alexandros Lazaridis
Andreas Fischer
C. Musat
28
0
0
18 Sep 2019
PaLM: A Hybrid Parser and Language Model
Hao Peng
Roy Schwartz
Noah A. Smith
AIMat
23
15
0
04 Sep 2019
Language Models as Knowledge Bases?
Fabio Petroni
Tim Rocktäschel
Patrick Lewis
A. Bakhtin
Yuxiang Wu
Alexander H. Miller
Sebastian Riedel
KELM
AI4MH
454
2,592
0
03 Sep 2019
Taskmaster-1: Toward a Realistic and Diverse Dialog Dataset
Bill Byrne
Karthikeyan K
Chinnadhurai Sankar
Arvind Neelakantan
Daniel Duckworth
Semih Yavuz
Ben Goodrich
Amit Dubey
A. Cedilnik
Kyu-Young Kim
18
215
0
01 Sep 2019
On the Effectiveness of Low-Rank Matrix Factorization for LSTM Model Compression
Genta Indra Winata
Andrea Madotto
Jamin Shin
Elham J. Barezi
Pascale Fung
27
28
0
27 Aug 2019
Recurrent Neural Networks: An Embedded Computing Perspective
Nesma M. Rezk
M. Purnaprajna
Tomas Nordstrom
Z. Ul-Abdin
43
81
0
23 Jul 2019
Techniques for Automated Machine Learning
Yi-Wei Chen
Qingquan Song
Xia Hu
18
48
0
21 Jul 2019
Few-Shot Representation Learning for Out-Of-Vocabulary Words
Ziniu Hu
Ting-Li Chen
Kai-Wei Chang
Yizhou Sun
24
76
0
01 Jul 2019
Evaluating Computational Language Models with Scaling Properties of Natural Language
Shuntaro Takahashi
Kumiko Tanaka-Ishii
16
23
0
22 Jun 2019
Barack's Wife Hillary: Using Knowledge-Graphs for Fact-Aware Language Modeling
Robert L. Logan IV
Nelson F. Liu
Matthew E. Peters
Matt Gardner
Sameer Singh
RALM
25
186
0
17 Jun 2019
Attention-based Modeling for Emotion Detection and Classification in Textual Conversations
Waleed Ragheb
J. Azé
S. Bringay
Maximilien Servajean
24
25
0
14 Jun 2019
Improving Neural Language Modeling via Adversarial Training
Dilin Wang
Chengyue Gong
Qiang Liu
AAML
43
115
0
10 Jun 2019
A Lightweight Recurrent Network for Sequence Modeling
Biao Zhang
Rico Sennrich
27
7
0
30 May 2019
Stochastic Gradient Methods with Layer-wise Adaptive Moments for Training of Deep Networks
Boris Ginsburg
P. Castonguay
Oleksii Hrinchuk
Oleksii Kuchaiev
Vitaly Lavrukhin
Ryan Leary
Jason Chun Lok Li
Huyen Nguyen
Yang Zhang
Jonathan M. Cohen
ODL
25
13
0
27 May 2019
Where's My Head? Definition, Dataset and Models for Numeric Fused-Heads Identification and Resolution
Yanai Elazar
Yoav Goldberg
19
23
0
26 May 2019
Training language GANs from Scratch
Cyprien de Masson d'Autume
Mihaela Rosca
Jack W. Rae
S. Mohamed
GAN
SyDa
9
87
0
23 May 2019
A framework for the extraction of Deep Neural Networks by leveraging public data
Soham Pal
Yash Gupta
Aditya Shukla
Aditya Kanade
S. Shevade
V. Ganapathy
FedML
MLAU
MIACV
36
56
0
22 May 2019
AMR Parsing as Sequence-to-Graph Transduction
Sheng Zhang
Xutai Ma
Kevin Duh
Benjamin Van Durme
33
148
0
21 May 2019
Adaptively Truncating Backpropagation Through Time to Control Gradient Bias
Christopher Aicher
N. Foti
E. Fox
MQ
30
32
0
17 May 2019
Probing What Different NLP Tasks Teach Machines about Function Word Comprehension
Najoung Kim
Roma Patel
Adam Poliak
Alex Jinpeng Wang
Patrick Xia
...
Alexis Ross
Tal Linzen
Benjamin Van Durme
Samuel R. Bowman
Ellie Pavlick
28
106
0
25 Apr 2019
Language Models with Transformers
Chenguang Wang
Mu Li
Alex Smola
17
121
0
20 Apr 2019
Pun Generation with Surprise
He He
Nanyun Peng
Percy Liang
33
69
0
15 Apr 2019
Knowledge Distillation For Recurrent Neural Network Language Modeling With Trust Regularization
Yangyang Shi
M. Hwang
X. Lei
Haoyu Sheng
34
25
0
08 Apr 2019
Identifying and Reducing Gender Bias in Word-Level Language Models
Shikha Bordia
Samuel R. Bowman
FaML
48
323
0
05 Apr 2019
Low Resource Text Classification with ULMFit and Backtranslation
Sam Shleifer
VLM
19
57
0
21 Mar 2019
Asynchronous Federated Optimization
Cong Xie
Oluwasanmi Koyejo
Indranil Gupta
FedML
29
562
0
10 Mar 2019
Context Vectors are Reflections of Word Vectors in Half the Dimensions
Z. Assylbekov
Rustem Takhanov
16
10
0
26 Feb 2019
Breaking the Softmax Bottleneck via Learnable Monotonic Pointwise Non-linearities
O. Ganea
Sylvain Gelly
Gary Bécigneul
Aliaksei Severyn
29
18
0
21 Feb 2019
Learning to Adaptively Scale Recurrent Neural Networks
Hao Hu
Liqiang Wang
Guo-Jun Qi
AI4CE
23
9
0
15 Feb 2019