Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1906.08237
Cited By
XLNet: Generalized Autoregressive Pretraining for Language Understanding
19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
AI4CE
Re-assign community
ArXiv
PDF
HTML
Papers citing
"XLNet: Generalized Autoregressive Pretraining for Language Understanding"
50 / 1,486 papers shown
Title
On Layer Normalization in the Transformer Architecture
Ruibin Xiong
Yunchang Yang
Di He
Kai Zheng
Shuxin Zheng
Chen Xing
Huishuai Zhang
Yanyan Lan
Liwei Wang
Tie-Yan Liu
AI4CE
59
953
0
12 Feb 2020
Feature Importance Estimation with Self-Attention Networks
Blaž Škrlj
S. Džeroski
Nada Lavrac
Matej Petković
FAtt
MILM
34
51
0
11 Feb 2020
ReClor: A Reading Comprehension Dataset Requiring Logical Reasoning
Weihao Yu
Zihang Jiang
Yanfei Dong
Jiashi Feng
LRM
25
245
0
11 Feb 2020
Localized Flood DetectionWith Minimal Labeled Social Media Data Using Transfer Learning
Neha Singh
Nirmalya Roy
A. Gangopadhyay
29
6
0
10 Feb 2020
Pre-training Tasks for Embedding-based Large-scale Retrieval
Wei-Cheng Chang
Felix X. Yu
Yin-Wen Chang
Yiming Yang
Sanjiv Kumar
RALM
13
301
0
10 Feb 2020
perm2vec: Graph Permutation Selection for Decoding of Error Correction Codes using Self-Attention
Nir Raviv
Avi Caciularu
Tomer Raviv
Jacob Goldberger
Yair Be’ery
26
8
0
06 Feb 2020
K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters
Ruize Wang
Duyu Tang
Nan Duan
Zhongyu Wei
Xuanjing Huang
Jianshu Ji
Guihong Cao
Daxin Jiang
Ming Zhou
KELM
48
545
0
05 Feb 2020
Schema-Guided Dialogue State Tracking Task at DSTC8
Abhinav Rastogi
Xiaoxue Zang
Srinivas Sunkara
Raghav Gupta
Pranav Khaitan
27
41
0
02 Feb 2020
Are Pre-trained Language Models Aware of Phrases? Simple but Strong Baselines for Grammar Induction
Taeuk Kim
Jihun Choi
Daniel Edmiston
Sang-goo Lee
22
90
0
30 Jan 2020
Retrospective Reader for Machine Reading Comprehension
ZhuoSheng Zhang
Junjie Yang
Hai Zhao
RALM
25
226
0
27 Jan 2020
Asking Questions the Human Way: Scalable Question-Answer Generation from Text Corpus
Bang Liu
Haojie Wei
Di Niu
Haolan Chen
Yancheng He
25
92
0
27 Jan 2020
DUMA: Reading Comprehension with Transposition Thinking
Pengfei Zhu
Hai Zhao
Xiaoguang Li
AI4CE
36
35
0
26 Jan 2020
Multilingual Denoising Pre-training for Neural Machine Translation
Yinhan Liu
Jiatao Gu
Naman Goyal
Xian Li
Sergey Edunov
Marjan Ghazvininejad
M. Lewis
Luke Zettlemoyer
AI4CE
AIMat
61
1,777
0
22 Jan 2020
ImageBERT: Cross-modal Pre-training with Large-scale Weak-supervised Image-Text Data
Di Qi
Lin Su
Jianwei Song
Edward Cui
Taroon Bharti
Arun Sacheti
VLM
40
259
0
22 Jan 2020
Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
Timo Schick
Hinrich Schütze
258
1,591
0
21 Jan 2020
A multimodal deep learning approach for named entity recognition from social media
M. Asgari-Chenaghlu
M. Feizi-Derakhshi
Leili Farzinvash
M. Balafar
C. Motamed
19
28
0
19 Jan 2020
A Common Semantic Space for Monolingual and Cross-Lingual Meta-Embeddings
G. R. Claramunt
Rodrigo Agerri
German Rigau
35
7
0
17 Jan 2020
CLUENER2020: Fine-grained Named Entity Recognition Dataset and Benchmark for Chinese
Liang Xu
Yu Tong
Qianqian Dong
Yixuan Liao
Cong Yu
Yin Tian
Weitang Liu
Lu Li
Caiquan Liu
Xuanwei Zhang
32
48
0
13 Jan 2020
ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training
Weizhen Qi
Yu Yan
Yeyun Gong
Dayiheng Liu
Nan Duan
Jiusheng Chen
Ruofei Zhang
Ming Zhou
AI4TS
27
446
0
13 Jan 2020
An Exploration of Embodied Visual Exploration
Santhosh Kumar Ramakrishnan
Dinesh Jayaraman
Kristen Grauman
LM&Ro
37
98
0
07 Jan 2020
Exploring Benefits of Transfer Learning in Neural Machine Translation
Tom Kocmi
32
17
0
06 Jan 2020
oLMpics -- On what Language Model Pre-training Captures
Alon Talmor
Yanai Elazar
Yoav Goldberg
Jonathan Berant
LRM
34
300
0
31 Dec 2019
Multi-Graph Transformer for Free-Hand Sketch Recognition
Peng Xu
Chaitanya K. Joshi
Xavier Bresson
ViT
25
85
0
24 Dec 2019
A Multimodal Target-Source Classifier with Attention Branches to Understand Ambiguous Instructions for Fetching Daily Objects
A. Magassouba
K. Sugiura
Hisashi Kawai
38
9
0
23 Dec 2019
Are Transformers universal approximators of sequence-to-sequence functions?
Chulhee Yun
Srinadh Bhojanapalli
A. S. Rawat
Sashank J. Reddi
Sanjiv Kumar
26
336
0
20 Dec 2019
End-to-end Named Entity Recognition and Relation Extraction using Pre-trained Language Models
John Giorgi
Xindi Wang
Nicola Sahar
W. Shin
Gary D. Bader
Bo Wang
11
36
0
20 Dec 2019
Pretrained Encyclopedia: Weakly Supervised Knowledge-Pretrained Language Model
Wenhan Xiong
Jingfei Du
William Yang Wang
Veselin Stoyanov
SSL
KELM
52
201
0
20 Dec 2019
PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization
Jingqing Zhang
Yao-Min Zhao
Mohammad Saleh
Peter J. Liu
RALM
3DGS
72
2,019
0
18 Dec 2019
Cross-Lingual Ability of Multilingual BERT: An Empirical Study
Karthikeyan K
Zihan Wang
Stephen D. Mayhew
Dan Roth
LRM
36
333
0
17 Dec 2019
FlauBERT: Unsupervised Language Model Pre-training for French
Hang Le
Loïc Vial
Jibril Frej
Vincent Segonne
Maximin Coavoux
Benjamin Lecouteux
A. Allauzen
Benoît Crabbé
Laurent Besacier
D. Schwab
AI4CE
49
395
0
11 Dec 2019
Zero-shot Text Classification With Generative Language Models
Raul Puri
Bryan Catanzaro
VLM
24
104
0
10 Dec 2019
Large-scale Pretraining for Visual Dialog: A Simple State-of-the-Art Baseline
Vishvak Murahari
Dhruv Batra
Devi Parikh
Abhishek Das
VLM
23
115
0
05 Dec 2019
An Exploration of Data Augmentation and Sampling Techniques for Domain-Agnostic Question Answering
Shayne Longpre
Yi Lu
Zhucheng Tu
Christopher DuBois
27
70
0
04 Dec 2019
Inducing Relational Knowledge from BERT
Zied Bouraoui
Jose Camacho-Collados
Steven Schockaert
29
166
0
28 Nov 2019
FairyTED: A Fair Rating Predictor for TED Talk Data
Rupam Acharyya
Shouman Das
Ankani Chattoraj
Md. Iftekhar Tanveer
27
12
0
25 Nov 2019
End-to-End Trainable Non-Collaborative Dialog System
Yu Li
Kun Qian
Weiyan Shi
Zhou Yu
21
45
0
25 Nov 2019
Invenio: Discovering Hidden Relationships Between Tasks/Domains Using Structured Meta Learning
Sameeksha Katoch
Kowshik Thopalli
Jayaraman J. Thiagarajan
Pavan Turaga
A. Spanias
11
4
0
24 Nov 2019
A Transformer-based approach to Irony and Sarcasm detection
Rolandos Alexandros Potamias
Georgios Siolas
A. Stafylopatis
33
206
0
23 Nov 2019
Factorized Multimodal Transformer for Multimodal Sequential Learning
Amir Zadeh
Chengfeng Mao
Kelly Shi
Yiwei Zhang
Paul Pu Liang
Soujanya Poria
Louis-Philippe Morency
25
44
0
22 Nov 2019
Global Greedy Dependency Parsing
Z. Li
Zhao Hai
Kevin Parnow
31
31
0
20 Nov 2019
The Eighth Dialog System Technology Challenge
Seokhwan Kim
Michel Galley
Chulaka Gunasekara
Sungjin Lee
Adam Atkinson
...
Tim K. Marks
Abhinav Rastogi
Xiaoxue Zang
Srinivas Sunkara
Raghav Gupta
VLM
27
65
0
14 Nov 2019
What do you mean, BERT? Assessing BERT as a Distributional Semantics Model
Timothee Mickus
Denis Paperno
Mathieu Constant
Kees van Deemter
29
45
0
13 Nov 2019
Compressive Transformers for Long-Range Sequence Modelling
Jack W. Rae
Anna Potapenko
Siddhant M. Jayakumar
Timothy Lillicrap
RALM
VLM
KELM
13
623
0
13 Nov 2019
Attending to Entities for Better Text Understanding
Pengxiang Cheng
K. Erk
LRM
24
37
0
11 Nov 2019
CamemBERT: a Tasty French Language Model
Louis Martin
Benjamin Muller
Pedro Ortiz Suarez
Yoann Dupont
Laurent Romary
Eric Villemonte de la Clergerie
Djamé Seddah
Benoît Sagot
42
956
0
10 Nov 2019
Rethinking Self-Attention: Towards Interpretability in Neural Parsing
Khalil Mrini
Franck Dernoncourt
Quan Tran
Trung Bui
W. Chang
Ndapandula Nakashole
MILM
LRM
18
29
0
10 Nov 2019
E-BERT: Efficient-Yet-Effective Entity Embeddings for BERT
Nina Poerner
Ulli Waltinger
Hinrich Schütze
20
156
0
09 Nov 2019
Improving Machine Reading Comprehension via Adversarial Training
Ziqing Yang
Yiming Cui
Wanxiang Che
Ting Liu
Shijin Wang
Guoping Hu
27
17
0
09 Nov 2019
How Decoding Strategies Affect the Verifiability of Generated Text
Luca Massarelli
Fabio Petroni
Aleksandra Piktus
Myle Ott
Tim Rocktaschel
Vassilis Plachouras
Fabrizio Silvestri
Sebastian Riedel
23
50
0
09 Nov 2019
On the Relationship between Self-Attention and Convolutional Layers
Jean-Baptiste Cordonnier
Andreas Loukas
Martin Jaggi
45
524
0
08 Nov 2019
Previous
1
2
3
...
27
28
29
30
Next