1906.08237
XLNet: Generalized Autoregressive Pretraining for Language Understanding
19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
AI4CE
Papers citing
"XLNet: Generalized Autoregressive Pretraining for Language Understanding"
50 / 3,518 papers shown
Title
Diversify Your Datasets: Analyzing Generalization via Controlled Variance in Adversarial Datasets
Ohad Rozen
Vered Shwartz
Roee Aharoni
Ido Dagan
AAML
90
38
0
21 Oct 2019
Keyphrase Extraction from Scholarly Articles as Sequence Labeling using Contextualized Embeddings
Dhruva Sahrawat
Debanjan Mahata
Mayank Kulkarni
Haimin Zhang
Rakesh Gosangi
Amanda Stent
Agniv Sharma
Yaman Kumar Singla
R. Shah
Roger Zimmermann
38
30
0
19 Oct 2019
A Mutual Information Maximization Perspective of Language Representation Learning
Lingpeng Kong
Cyprien de Masson d'Autume
Wang Ling
Lei Yu
Zihang Dai
Dani Yogatama
SSL
284
167
0
18 Oct 2019
PLATO: Pre-trained Dialogue Generation Model with Discrete Latent Variable
Siqi Bao
H. He
Fan Wang
Hua Wu
Haifeng Wang
91
271
0
17 Oct 2019
BIG MOOD: Relating Transformers to Explicit Commonsense Knowledge
Jeff Da
24
0
0
17 Oct 2019
Evolution of transfer learning in natural language processing
Aditya Malte
Pratik Ratadiya
63
54
0
16 Oct 2019
BERTRAM: Improved Word Embeddings Have Big Impact on Contextualized Model Performance
Timo Schick
Hinrich Schütze
83
50
0
16 Oct 2019
Structured Pruning of a BERT-based Question Answering Model
J. Scott McCarley
Rishav Chakravarti
Avirup Sil
98
53
0
14 Oct 2019
Q8BERT: Quantized 8Bit BERT
Ofir Zafrir
Guy Boudoukh
Peter Izsak
Moshe Wasserblat
MQ
112
507
0
14 Oct 2019
Stabilizing Transformers for Reinforcement Learning
Emilio Parisotto
H. F. Song
Jack W. Rae
Razvan Pascanu
Çağlar Gülçehre
...
Aidan Clark
Seb Noury
M. Botvinick
N. Heess
R. Hadsell
OffRL
103
368
0
13 Oct 2019
vq-wav2vec: Self-Supervised Learning of Discrete Speech Representations
Alexei Baevski
Steffen Schneider
Michael Auli
SSL
185
668
0
12 Oct 2019
PipeMare: Asynchronous Pipeline Parallel DNN Training
Bowen Yang
Jian Zhang
Jonathan Li
Christopher Ré
Christopher R. Aberger
Christopher De Sa
77
113
0
09 Oct 2019
Knowledge Distillation from Internal Representations
Gustavo Aguilar
Yuan Ling
Yu Zhang
Benjamin Yao
Xing Fan
Edward Guo
106
181
0
08 Oct 2019
BERT for Evidence Retrieval and Claim Verification
Shrishti Saha Shetu
Christof Monz
E. Mabande
RALM
94
126
0
07 Oct 2019
MASTER: Multi-Aspect Non-local Network for Scene Text Recognition
Ning Lu
Wenwen Yu
Xianbiao Qi
Yihao Chen
Ping Gong
Rong Xiao
Xiang Bai
70
158
0
07 Oct 2019
Distilling BERT into Simple Neural Networks with Unlabeled Transfer Data
Subhabrata Mukherjee
Ahmed Hassan Awadallah
91
25
0
04 Oct 2019
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh
Lysandre Debut
Julien Chaumond
Thomas Wolf
313
7,575
0
02 Oct 2019
SummAE: Zero-Shot Abstractive Text Summarization using Length-Agnostic Auto-Encoders
Peter J. Liu
Yu-An Chung
Jie Jessie Ren
113
20
0
02 Oct 2019
Exploiting BERT for End-to-End Aspect-based Sentiment Analysis
Xin Li
Lidong Bing
Wenxuan Zhang
W. Lam
83
281
0
02 Oct 2019
State-of-the-Art Speech Recognition Using Multi-Stream Self-Attention With Dilated 1D Convolutions
Kyu Jeong Han
R. Prieto
Kaixing (Kai) Wu
T. Ma
134
70
0
01 Oct 2019
Better Document-Level Machine Translation with Bayes' Rule
Lei Yu
Laurent Sartran
Wojciech Stokowiec
Wang Ling
Lingpeng Kong
Phil Blunsom
Chris Dyer
77
7
0
01 Oct 2019
MMM: Multi-stage Multi-task Learning for Multi-choice Reading Comprehension
Di Jin
Shuyang Gao
Jiun-Yu Kao
Tagyoung Chung
Dilek Z. Hakkani-Tür
78
69
0
01 Oct 2019
TMLab: Generative Enhanced Model (GEM) for adversarial attacks
P. Niewinski
M. Pszona
M. Janicka
VLM
GAN
66
14
0
01 Oct 2019
Biomedical relation extraction with pre-trained language representations and minimal task-specific architecture
Ashok Thillaisundaram
Theodosia Togia
57
17
0
26 Sep 2019
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL
AIMat
504
6,482
0
26 Sep 2019
Aspect and Opinion Term Extraction for Hotel Reviews using Transfer Learning and Auxiliary Labels
Yosef Ardhito Winatmoko
Ali Akbar Septiandri
Arie Pratama Sutiono
11
4
0
26 Sep 2019
Pre-train, Interact, Fine-tune: A Novel Interaction Representation for Text Classification
Jianming Zheng
Fei Cai
Honghui Chen
Maarten de Rijke
61
22
0
26 Sep 2019
FreeLB: Enhanced Adversarial Training for Natural Language Understanding
Chen Zhu
Yu Cheng
Zhe Gan
S. Sun
Tom Goldstein
Jingjing Liu
AAML
296
443
0
25 Sep 2019
UNITER: UNiversal Image-TExt Representation Learning
Yen-Chun Chen
Linjie Li
Licheng Yu
Ahmed El Kholy
Faisal Ahmed
Zhe Gan
Yu Cheng
Jingjing Liu
VLM
OT
134
448
0
25 Sep 2019
Extremely Small BERT Models from Mixed-Vocabulary Training
Sanqiang Zhao
Raghav Gupta
Yang Song
Denny Zhou
VLM
66
53
0
25 Sep 2019
Reducing Transformer Depth on Demand with Structured Dropout
Angela Fan
Edouard Grave
Armand Joulin
130
597
0
25 Sep 2019
Multi-task Batch Reinforcement Learning with Metric Learning
Jiachen Li
Q. Vuong
Shuang Liu
Minghua Liu
K. Ciosek
George Andriopoulos
Henrik I. Christensen
H. Su
OffRL
65
2
0
25 Sep 2019
Mixout: Effective Regularization to Finetune Large-scale Pretrained Language Models
Cheolhyoung Lee
Kyunghyun Cho
Wanmo Kang
MoE
291
209
0
25 Sep 2019
Technical report on Conversational Question Answering
Yingnan Ju
Fubang Zhao
Shijie Chen
Bowen Zheng
Xuefeng Yang
Yunfeng Liu
67
49
0
24 Sep 2019
Portuguese Named Entity Recognition using BERT-CRF
Fábio Souza
Rodrigo Nogueira
R. Lotufo
73
258
0
23 Sep 2019
Cross-Lingual Natural Language Generation via Pre-Training
Zewen Chi
Li Dong
Furu Wei
Wenhui Wang
Xian-Ling Mao
Heyan Huang
101
138
0
23 Sep 2019
Does BERT Make Any Sense? Interpretable Word Sense Disambiguation with Contextualized Embeddings
Gregor Wiedemann
Steffen Remus
Avi Chawla
Chris Biemann
111
176
0
23 Sep 2019
TinyBERT: Distilling BERT for Natural Language Understanding
Xiaoqi Jiao
Yichun Yin
Lifeng Shang
Xin Jiang
Xiao Chen
Linlin Li
F. Wang
Qun Liu
VLM
130
1,881
0
23 Sep 2019
Representation Learning for Electronic Health Records
W. Weng
Peter Szolovits
81
19
0
19 Sep 2019
Summary Level Training of Sentence Rewriting for Abstractive Summarization
Sanghwan Bae
Taeuk Kim
Jihoon Kim
Sang-goo Lee
73
69
0
19 Sep 2019
Language models and Automated Essay Scoring
Pedro Uría Rodríguez
Amir Jafari
C. Ormerod
67
86
0
18 Sep 2019
Extractive Summarization of Long Documents by Combining Global and Local Context
Wen Xiao
Giuseppe Carenini
112
153
0
17 Sep 2019
Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism
Mohammad Shoeybi
M. Patwary
Raul Puri
P. LeGresley
Jared Casper
Bryan Catanzaro
MoE
363
1,926
0
17 Sep 2019
K-BERT: Enabling Language Representation with Knowledge Graph
Weijie Liu
Peng Zhou
Zhe Zhao
Zhiruo Wang
Qi Ju
Haotang Deng
Ping Wang
316
796
0
17 Sep 2019
Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations
Sawyer Birnbaum
Volodymyr Kuleshov
S. Enam
Pang Wei Koh
Stefano Ermon
AI4TS
84
70
0
14 Sep 2019
SANVis: Visual Analytics for Understanding Self-Attention Networks
Cheonbok Park
Inyoup Na
Yongjang Jo
Sungbok Shin
J. Yoo
Bum Chul Kwon
Jian Zhao
Hyungjong Noh
Yeonsoo Lee
Jaegul Choo
HAI
70
40
0
13 Sep 2019
Frustratingly Easy Natural Question Answering
Lin Pan
Rishav Chakravarti
Anthony Ferritto
Michael R. Glass
A. Gliozzo
Salim Roukos
Radu Florian
Avirup Sil
59
14
0
11 Sep 2019
Question Generation by Transformers
Kettip Kriangchaivech
A. Wangperawong
48
28
0
09 Sep 2019
Span Selection Pre-training for Question Answering
Michael R. Glass
A. Gliozzo
Rishav Chakravarti
Anthony Ferritto
Lin Pan
G P Shrivatsa Bhargav
Dinesh Garg
Avirup Sil
RALM
97
73
0
09 Sep 2019
Forecaster: A Graph Transformer for Forecasting Spatial and Time-Dependent Data
Yongqian Li
J. M. F. Moura
AI4TS
71
31
0
09 Sep 2019