1906.08237
Cited By
XLNet: Generalized Autoregressive Pretraining for Language Understanding
19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
AI4CE
Papers citing
"XLNet: Generalized Autoregressive Pretraining for Language Understanding"
50 / 1,456 papers shown
Title
Stabilizing Transformers for Reinforcement Learning
Emilio Parisotto
H. F. Song
Jack W. Rae
Razvan Pascanu
Çağlar Gülçehre
...
Aidan Clark
Seb Noury
M. Botvinick
N. Heess
R. Hadsell
OffRL
22
360
0
13 Oct 2019
vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations
Alexei Baevski
Steffen Schneider
Michael Auli
SSL
22
660
0
12 Oct 2019
PipeMare: Asynchronous Pipeline Parallel DNN Training
Bowen Yang
Jian Zhang
Jonathan Li
Christopher Ré
Christopher R. Aberger
Christopher De Sa
11
110
0
09 Oct 2019
Knowledge Distillation from Internal Representations
Gustavo Aguilar
Yuan Ling
Yu Zhang
Benjamin Yao
Xing Fan
Edward Guo
33
178
0
08 Oct 2019
BERT for Evidence Retrieval and Claim Verification
Shrishti Saha Shetu
Christof Monz
E. Mabande
RALM
23
120
0
07 Oct 2019
MASTER: Multi-Aspect Non-local Network for Scene Text Recognition
Ning Lu
Wenwen Yu
Xianbiao Qi
Yihao Chen
Ping Gong
Rong Xiao
Xiang Bai
30
157
0
07 Oct 2019
Distilling BERT into Simple Neural Networks with Unlabeled Transfer Data
Subhabrata Mukherjee
Ahmed Hassan Awadallah
26
25
0
04 Oct 2019
State-of-the-Art Speech Recognition Using Multi-Stream Self-Attention With Dilated 1D Convolutions
Kyu Jeong Han
R. Prieto
Kaixing (Kai) Wu
T. Ma
12
69
0
01 Oct 2019
MMM: Multi-stage Multi-task Learning for Multi-choice Reading Comprehension
Di Jin
Shuyang Gao
Jiun-Yu Kao
Tagyoung Chung
Dilek Z. Hakkani-Tür
29
69
0
01 Oct 2019
Biomedical relation extraction with pre-trained language representations and minimal task-specific architecture
Ashok Thillaisundaram
Theodosia Togia
24
17
0
26 Sep 2019
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL
AIMat
112
6,380
0
26 Sep 2019
Extremely Small BERT Models from Mixed-Vocabulary Training
Sanqiang Zhao
Raghav Gupta
Yang Song
Denny Zhou
VLM
14
53
0
25 Sep 2019
Reducing Transformer Depth on Demand with Structured Dropout
Angela Fan
Edouard Grave
Armand Joulin
43
584
0
25 Sep 2019
Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models
Cheolhyoung Lee
Kyunghyun Cho
Wanmo Kang
MoE
249
208
0
25 Sep 2019
Portuguese Named Entity Recognition using BERT-CRF
Fábio Souza
Rodrigo Nogueira
R. Lotufo
22
251
0
23 Sep 2019
Cross-Lingual Natural Language Generation via Pre-Training
Zewen Chi
Li Dong
Furu Wei
Wenhui Wang
Xian-Ling Mao
Heyan Huang
27
136
0
23 Sep 2019
Does BERT Make Any Sense? Interpretable Word Sense Disambiguation with Contextualized Embeddings
Gregor Wiedemann
Steffen Remus
Avi Chawla
Chris Biemann
27
174
0
23 Sep 2019
Representation Learning for Electronic Health Records
W. Weng
Peter Szolovits
36
19
0
19 Sep 2019
Summary Level Training of Sentence Rewriting for Abstractive Summarization
Sanghwan Bae
Taeuk Kim
Jihoon Kim
Sang-goo Lee
38
68
0
19 Sep 2019
Language models and Automated Essay Scoring
Pedro Uría Rodríguez
Amir Jafari
C. Ormerod
30
82
0
18 Sep 2019
K-BERT: Enabling Language Representation with Knowledge Graph
Weijie Liu
Peng Zhou
Zhe Zhao
Zhiruo Wang
Qi Ju
Haotang Deng
Ping Wang
231
777
0
17 Sep 2019
Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations
Sawyer Birnbaum
Volodymyr Kuleshov
S. Enam
Pang Wei Koh
Stefano Ermon
AI4TS
24
68
0
14 Sep 2019
SANVis: Visual Analytics for Understanding Self-Attention Networks
Cheonbok Park
Inyoup Na
Yongjang Jo
Sungbok Shin
J. Yoo
Bum Chul Kwon
Jian Zhao
Hyungjong Noh
Yeonsoo Lee
Jaegul Choo
HAI
35
38
0
13 Sep 2019
Frustratingly Easy Natural Question Answering
Lin Pan
Rishav Chakravarti
Anthony Ferritto
Michael R. Glass
A. Gliozzo
Salim Roukos
Radu Florian
Avirup Sil
24
14
0
11 Sep 2019
Span Selection Pre-training for Question Answering
Michael R. Glass
A. Gliozzo
Rishav Chakravarti
Anthony Ferritto
Lin Pan
G P Shrivatsa Bhargav
Dinesh Garg
Avirup Sil
RALM
38
70
0
09 Sep 2019
Graph-Based Reasoning over Heterogeneous External Knowledge for Commonsense Question Answering
Shangwen Lv
Daya Guo
Jingjing Xu
Duyu Tang
Nan Duan
Ming Gong
Linjun Shou
Daxin Jiang
Guihong Cao
Songlin Hu
RALM
15
202
0
09 Sep 2019
Reasoning Over Semantic-Level Graph for Fact Checking
Wanjun Zhong
Jingjing Xu
Duyu Tang
Zenan Xu
Nan Duan
M. Zhou
Jiahai Wang
Jian Yin
HILM
GNN
185
166
0
09 Sep 2019
Context-aware Deep Model for Entity Recommendation in Search Engine at Alibaba
Qianghuai Jia
Ningyu Zhang
Nengwei Hua
19
5
0
06 Sep 2019
Semantics-aware BERT for Language Understanding
ZhuoSheng Zhang
Yuwei Wu
Zhao Hai
Z. Li
Shuailiang Zhang
Xi Zhou
Xiang Zhou
21
365
0
05 Sep 2019
KagNet: Knowledge-Aware Graph Networks for Commonsense Reasoning
Bill Yuchen Lin
Xinyue Chen
Jamin Chen
Xiang Ren
24
460
0
04 Sep 2019
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers
Iryna Gurevych
87
11,768
0
27 Aug 2019
Patient Knowledge Distillation for BERT Model Compression
S. Sun
Yu Cheng
Zhe Gan
Jingjing Liu
78
831
0
25 Aug 2019
VL-BERT: Pre-training of Generic Visual-Linguistic Representations
Weijie Su
Xizhou Zhu
Yue Cao
Bin Li
Lewei Lu
Furu Wei
Jifeng Dai
VLM
MLLM
SSL
82
1,651
0
22 Aug 2019
Align, Mask and Select: A Simple Method for Incorporating Commonsense Knowledge into Language Representation Models
Zhiquan Ye
Qian Chen
Wen Wang
Zhenhua Ling
27
68
0
19 Aug 2019
Unicoder-VL: A Universal Encoder for Vision and Language by Cross-modal Pre-training
Gen Li
Nan Duan
Yuejian Fang
Ming Gong
Daxin Jiang
Ming Zhou
SSL
VLM
MLLM
89
895
0
16 Aug 2019
Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding
Oren Barkan
Noam Razin
Itzik Malkiel
Ori Katz
Avi Caciularu
Noam Koenigstein
FedML
25
37
0
14 Aug 2019
On Identifiability in Transformers
Gino Brunner
Yang Liu
Damian Pascual
Oliver Richter
Massimiliano Ciaramita
Roger Wattenhofer
ViT
30
186
0
12 Aug 2019
DuTongChuan: Context-aware Translation Model for Simultaneous Interpreting
Hao Xiong
Ruiqing Zhang
Chuanqiang Zhang
Zhongjun He
Hua Wu
Haifeng Wang
41
25
0
30 Jul 2019
Leveraging Pre-trained Checkpoints for Sequence Generation Tasks
S. Rothe
Shashi Narayan
Aliaksei Severyn
SILM
71
433
0
29 Jul 2019
DLGNet: A Transformer-based Model for Dialogue Response Generation
O. Olabiyi
Erik T. Mueller
16
12
0
26 Jul 2019
What is this Article about? Extreme Summarization with Topic-aware Convolutional Neural Networks
Shashi Narayan
Shay B. Cohen
Mirella Lapata
AILaw
31
18
0
19 Jul 2019
DeepTrax: Embedding Graphs of Financial Transactions
C. Bayan Bruss
Anish Khazane
Jonathan Rider
R. Serpe
Antonia Gogoglou
Keegan E. Hines
AIFin
GNN
32
43
0
16 Jul 2019
Head-Driven Phrase Structure Grammar Parsing on Penn Treebank
Junru Zhou
Zhao Hai
42
144
0
05 Jul 2019
A Review of Keyphrase Extraction
Eirini Papagiannopoulou
Grigorios Tsoumakas
21
166
0
13 May 2019
Deep Unsupervised Cardinality Estimation
Zongheng Yang
Eric Liang
Amog Kamsetty
Chenggang Wu
Yan Duan
Peter Chen
Pieter Abbeel
J. M. Hellerstein
S. Krishnan
Ion Stoica
27
203
0
10 May 2019
Taming Pretrained Transformers for Extreme Multi-label Text Classification
Wei-Cheng Chang
Hsiang-Fu Yu
Kai Zhong
Yiming Yang
Inderjit Dhillon
25
20
0
07 May 2019
Terminologies augmented recurrent neural network model for clinical named entity recognition
Ivan Lerner
N. Paris
Xavier Tannier
49
37
0
25 Apr 2019
DocBERT: BERT for Document Classification
Ashutosh Adhikari
Achyudh Ram
Raphael Tang
Jimmy J. Lin
LLMAG
VLM
13
296
0
17 Apr 2019
Recent Advances in Natural Language Inference: A Survey of Benchmarks, Resources, and Approaches
Shane Storks
Qiaozi Gao
J. Chai
21
128
0
02 Apr 2019
Dual Co-Matching Network for Multi-choice Reading Comprehension
Shuailiang Zhang
Zhao Hai
Yuwei Wu
ZhuoSheng Zhang
Xi Zhou
Xiaoping Zhou
39
131
0
27 Jan 2019