1906.08237
XLNet: Generalized Autoregressive Pretraining for Language Understanding
19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
AI4CE
Papers citing
"XLNet: Generalized Autoregressive Pretraining for Language Understanding"
50 / 3,524 papers shown
Distilling Linguistic Context for Language Model Compression
Geondo Park
Gyeongman Kim
Eunho Yang
81
38
0
17 Sep 2021
MeLT: Message-Level Transformer with Masked Document Representations as Pre-Training for Stance Detection
Matthew Matero
Nikita Soni
Niranjan Balasubramanian
H. Andrew Schwartz
80
21
0
16 Sep 2021
Language Models are Few-shot Multilingual Learners
Genta Indra Winata
Andrea Madotto
Zhaojiang Lin
Rosanne Liu
J. Yosinski
Pascale Fung
ELM
LRM
119
138
0
16 Sep 2021
Efficient Domain Adaptation of Language Models via Adaptive Tokenization
Vin Sachidananda
Jason S Kessler
Yi-An Lai
64
38
0
15 Sep 2021
Should We Be Pre-training? An Argument for End-task Aware Training as an Alternative
Lucio Dery
Paul Michel
Ameet Talwalkar
Graham Neubig
CLL
106
35
0
15 Sep 2021
Unsupervised Keyphrase Extraction by Jointly Modeling Local and Global Context
Xinnian Liang
Shuangzhi Wu
Mu Li
Zhoujun Li
89
62
0
15 Sep 2021
EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation
Chenhe Dong
Guangrun Wang
Hang Xu
Jiefeng Peng
Xiaozhe Ren
Xiaodan Liang
87
28
0
15 Sep 2021
Explainable Identification of Dementia from Transcripts using Transformer Networks
Loukas Ilias
D. Askounis
80
39
0
14 Sep 2021
Semantic Answer Type Prediction using BERT: IAI at the ISWC SMART Task 2020
Vinay Setty
K. Balog
43
12
0
14 Sep 2021
KFCNet: Knowledge Filtering and Contrastive Learning Network for Generative Commonsense Reasoning
Haonan Li
Yeyun Gong
Jian Jiao
Ruofei Zhang
Timothy Baldwin
Nan Duan
OffRL
93
6
0
14 Sep 2021
Different Strokes for Different Folks: Investigating Appropriate Further Pre-training Approaches for Diverse Dialogue Tasks
Yao Qiu
Jinchao Zhang
Jie Zhou
61
5
0
14 Sep 2021
STraTA: Self-Training with Task Augmentation for Better Few-shot Learning
Tu Vu
Minh-Thang Luong
Quoc V. Le
Grady Simon
Mohit Iyyer
193
61
0
13 Sep 2021
KroneckerBERT: Learning Kronecker Decomposition for Pre-trained Language Models via Knowledge Distillation
Marzieh S. Tahaei
Ella Charlaix
V. Nia
A. Ghodsi
Mehdi Rezagholizadeh
110
22
0
13 Sep 2021
Not All Models Localize Linguistic Knowledge in the Same Place: A Layer-wise Probing on BERToids' Representations
Mohsen Fayyaz
Ehsan Aghazadeh
Ali Modarressi
Hosein Mohebbi
Mohammad Taher Pilehvar
35
21
0
13 Sep 2021
Question Answering over Electronic Devices: A New Benchmark Dataset and a Multi-Task Learning based QA Framework
Abhilash Nandy
Soumya Sharma
Shubham Maddhashiya
K. Sachdeva
Pawan Goyal
Niloy Ganguly
70
19
0
13 Sep 2021
Show Me How To Revise: Improving Lexically Constrained Sentence Generation with XLNet
Xingwei He
Victor O.K. Li
BDL
282
24
0
13 Sep 2021
How to Select One Among All? An Extensive Empirical Study Towards the Robustness of Knowledge Distillation in Natural Language Understanding
Tianda Li
Ahmad Rashid
A. Jafari
Pranav Sharma
A. Ghodsi
Mehdi Rezagholizadeh
AAML
122
5
0
13 Sep 2021
Raise a Child in Large Language Model: Towards Effective and Generalizable Fine-tuning
Runxin Xu
Fuli Luo
Zhiyuan Zhang
Chuanqi Tan
Baobao Chang
Songfang Huang
Fei Huang
LRM
211
190
0
13 Sep 2021
FLiText: A Faster and Lighter Semi-Supervised Text Classification with Convolution Networks
Chen Liu
Mengchao Zhang
Liang Pang
Jiafeng Guo
Xueqi Cheng
CLIP
66
19
0
12 Sep 2021
AdaK-NER: An Adaptive Top-K Approach for Named Entity Recognition with Incomplete Annotations
Hongtao Ruan
Liying Zheng
Peixian Hu
38
0
0
11 Sep 2021
Asking Questions Like Educational Experts: Automatically Generating Question-Answer Pairs on Real-World Examination Data
Fanyi Qu
Xin Jia
Hao Sun
AI4Ed
148
24
0
11 Sep 2021
FBERT: A Neural Transformer for Identifying Offensive Content
Diptanu Sarkar
Marcos Zampieri
Tharindu Ranasinghe
Alexander Ororbia
VLM
70
56
0
10 Sep 2021
Does Pretraining for Summarization Require Knowledge Transfer?
Kundan Krishna
Jeffrey P. Bigham
Zachary Chase Lipton
73
39
0
10 Sep 2021
Artificial Text Detection via Examining the Topology of Attention Maps
Laida Kushnareva
D. Cherniavskii
Vladislav Mikhailov
Ekaterina Artemova
S. Barannikov
A. Bernstein
Irina Piontkovskaya
D. Piontkovski
Evgeny Burnaev
104
51
0
10 Sep 2021
RoR: Read-over-Read for Long Document Machine Reading Comprehension
Jing Zhao
Junwei Bao
Yifan Wang
Yongwei Zhou
Youzheng Wu
Xiaodong He
Bowen Zhou
AIMat
114
24
0
10 Sep 2021
Counterfactual Adversarial Learning with Representation Interpolation
Wen Wang
Wei Ping
Ning Shi
Jinfeng Li
Bingyu Zhu
Xiangyu Liu
Rongxin Zhang
AAML
OOD
CML
76
2
0
10 Sep 2021
On the validity of pre-trained transformers for natural language processing in the software engineering domain
Julian von der Mosel
Alexander Trautsch
Steffen Herbold
74
68
0
10 Sep 2021
EfficientCLIP: Efficient Cross-Modal Pre-training by Ensemble Confident Learning and Language Modeling
Jue Wang
Haofan Wang
Jincan Deng
Weijia Wu
Debing Zhang
VLM
CLIP
118
19
0
10 Sep 2021
Augmenting BERT-style Models with Predictive Coding to Improve Discourse-level Representations
Vladimir Araujo
Andrés Villa
Marcelo Mendoza
Marie-Francine Moens
Alvaro Soto
61
7
0
10 Sep 2021
BERT, mBERT, or BiBERT? A Study on Contextualized Embeddings for Neural Machine Translation
Haoran Xu
Benjamin Van Durme
Kenton W. Murray
113
62
0
09 Sep 2021
AStitchInLanguageModels: Dataset and Methods for the Exploration of Idiomaticity in Pre-Trained Language Models
Harish Tayyar Madabushi
Edward Gow-Smith
Carolina Scarton
Aline Villavicencio
48
39
0
09 Sep 2021
All Bark and No Bite: Rogue Dimensions in Transformer Language Models Obscure Representational Quality
William Timkey
Marten van Schijndel
310
116
0
09 Sep 2021
MetaXT: Meta Cross-Task Transfer between Disparate Label Spaces
Srinagesh Sharma
Guoqing Zheng
Ahmed Hassan Awadallah
50
1
0
09 Sep 2021
KELM: Knowledge Enhanced Pre-Trained Language Representations with Message Passing on Hierarchical Relational Graphs
Yinquan Lu
H. Lu
Guirong Fu
Qun Liu
KELM
46
34
0
09 Sep 2021
Efficient Nearest Neighbor Language Models
Junxian He
Graham Neubig
Taylor Berg-Kirkpatrick
RALM
278
106
0
09 Sep 2021
Self- and Pseudo-self-supervised Prediction of Speaker and Key-utterance for Multi-party Dialogue Reading Comprehension
Yiyang Li
Hai Zhao
75
24
0
08 Sep 2021
Discrete and Soft Prompting for Multilingual Models
Mengjie Zhao
Hinrich Schütze
LRM
92
72
0
08 Sep 2021
NSP-BERT: A Prompt-based Few-Shot Learner Through an Original Pre-training Task--Next Sentence Prediction
Yi Sun
Yu Zheng
Chao Hao
Hangping Qiu
VLM
107
37
0
08 Sep 2021
Cross-lingual Offensive Language Identification for Low Resource Languages: The Case of Marathi
Saurabh Gaikwad
Tharindu Ranasinghe
Marcos Zampieri
Christopher Homan
89
65
0
08 Sep 2021
Beyond Preserved Accuracy: Evaluating Loyalty and Robustness of BERT Compression
Canwen Xu
Wangchunshu Zhou
Tao Ge
Kelvin J. Xu
Julian McAuley
Furu Wei
76
42
0
07 Sep 2021
Generate & Rank: A Multi-task Framework for Math Word Problems
Jianhao Shen
Yichun Yin
Lin Li
Lifeng Shang
Xin Jiang
Ming Zhang
Qun Liu
AIMat
93
133
0
07 Sep 2021
Sequential Attention Module for Natural Language Processing
Mengyuan Zhou
Jian Ma
Haiqing Yang
Lian-Xin Jiang
Yang Mo
AI4TS
43
2
0
07 Sep 2021
GPT-3 Models are Poor Few-Shot Learners in the Biomedical Domain
M. Moradi
Kathrin Blagec
F. Haberl
Matthias Samwald
LM&MA
AI4MH
103
66
0
06 Sep 2021
PermuteFormer: Efficient Relative Position Encoding for Long Sequences
Peng-Jen Chen
93
21
0
06 Sep 2021
LAViTeR: Learning Aligned Visual and Textual Representations Assisted by Image and Caption Generation
Mohammad Abuzar Shaikh
Zhanghexuan Ji
Dana Moukheiber
Yan Shen
S. Srihari
Mingchen Gao
VLM
65
1
0
04 Sep 2021
Frustratingly Simple Pretraining Alternatives to Masked Language Modeling
Atsuki Yamaguchi
G. Chrysostomou
Katerina Margatina
Nikolaos Aletras
69
25
0
04 Sep 2021
Hybrid Contrastive Learning of Tri-Modal Representation for Multimodal Sentiment Analysis
Sijie Mai
Ying Zeng
Shuangjia Zheng
Haifeng Hu
82
125
0
04 Sep 2021
A Bayesian Approach to (Online) Transfer Learning: Theory and Algorithms
Xuetong Wu
J. Manton
U. Aickelin
Jingge Zhu
74
17
0
03 Sep 2021
An Empirical Exploration in Quality Filtering of Text Data
Leo Gao
70
11
0
02 Sep 2021
CTAL: Pre-training Cross-modal Transformer for Audio-and-Language Representations
Hang Li
Yunxing Kang
Tianqiao Liu
Wenbiao Ding
Zitao Liu
78
19
0
01 Sep 2021