ResearchTrend.AI
XLNet: Generalized Autoregressive Pretraining for Language Understanding

19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
    AI4CE

Papers citing "XLNet: Generalized Autoregressive Pretraining for Language Understanding"

50 / 3,521 papers shown
Named Entity Recognition in the Style of Object Detection
Bing Li
49
4
0
26 Jan 2021
Muppet: Massive Multi-task Representations with Pre-Finetuning
Armen Aghajanyan
Anchit Gupta
Akshat Shrivastava
Xilun Chen
Luke Zettlemoyer
Sonal Gupta
100
270
0
26 Jan 2021
Training Multilingual Pre-trained Language Model with Byte-level Subwords
Junqiu Wei
Qun Liu
Yinpeng Guo
Xin Jiang
63
20
0
23 Jan 2021
Enhanced word embeddings using multi-semantic representation through lexical chains
Terry Ruas
C. H. P. Ferreira
W. Grosky
F. O. França
D. D. Medeiros
90
18
0
22 Jan 2021
Distilling Large Language Models into Tiny and Effective Students using pQRNN
P. Kaliamoorthi
Aditya Siddhant
Edward Li
Melvin Johnson
MQ
60
17
0
21 Jan 2021
Learning rich touch representations through cross-modal self-supervision
Martina Zambelli
Y. Aytar
Francesco Visin
Yuxiang Zhou
R. Hadsell
SSL
82
16
0
21 Jan 2021
Open-Domain Conversational Search Assistant with Transformers
Rafael Ferreira
Mariana Leite
David Semedo
João Magalhães
46
11
0
20 Jan 2021
SUGAR: Subgraph Neural Network with Reinforcement Pooling and Self-Supervised Mutual Information Mechanism
Qingyun Sun
Jianxin Li
Hao Peng
Hongzhi Zhang
Yuanxing Ning
Phillip S. Yu
Lifang He
71
168
0
20 Jan 2021
Learning to Augment for Data-Scarce Domain BERT Knowledge Distillation
Lingyun Feng
Minghui Qiu
Yaliang Li
Haitao Zheng
Ying Shen
97
10
0
20 Jan 2021
Situation and Behavior Understanding by Trope Detection on Films
Chen-Hsi Chang
Hung-Ting Su
Jui-Heng Hsu
Yu-Siang Wang
Yu-Cheng Chang
Zhe-Yu Liu
Ya-Liang Chang
Wen-Feng Cheng
Ke-Jyun Wang
Winston H. Hsu
122
7
0
19 Jan 2021
HinFlair: pre-trained contextual string embeddings for pos tagging and text classification in the Hindi language
Harsh Patel
VLM AI4CE
31
1
0
18 Jan 2021
What Makes Good In-Context Examples for GPT-3?
Jiachang Liu
Dinghan Shen
Yizhe Zhang
Bill Dolan
Lawrence Carin
Weizhu Chen
AAML RALM
400
1,399
0
17 Jan 2021
Efficiently Fusing Pretrained Acoustic and Linguistic Encoders for Low-resource Speech Recognition
Cheng Yi
Shiyu Zhou
Bo Xu
108
40
0
17 Jan 2021
ComQA: Compositional Question Answering via Hierarchical Graph Neural Networks
Bingning Wang
Ting Yao
Weipeng Chen
Jingfang Xu
Xiaochuan Wang
CoGe
70
6
0
16 Jan 2021
LIME: Learning Inductive Bias for Primitives of Mathematical Reasoning
Yuhuai Wu
M. Rabe
Wenda Li
Jimmy Ba
Roger C. Grosse
Christian Szegedy
AIMat LRM
144
57
0
15 Jan 2021
KDLSQ-BERT: A Quantized Bert Combining Knowledge Distillation with Learned Step Size Quantization
Jing Jin
Cai Liang
Tiancheng Wu
Li Zou
Zhiliang Gan
MQ
59
27
0
15 Jan 2021
Training Data Leakage Analysis in Language Models
Huseyin A. Inan
Osman Ramadan
Lukas Wutschitz
Daniel Jones
Victor Rühle
James Withers
Robert Sim
MIACV PILM
98
9
0
14 Jan 2021
Fake News Detection System using XLNet model with Topic Distributions: CONSTRAINT@AAAI2021 Shared Task
Akansha Gautam
Venktesh V
Sarah Masud
71
32
0
12 Jan 2021
Of Non-Linearity and Commutativity in BERT
Sumu Zhao
Damian Pascual
Gino Brunner
Roger Wattenhofer
105
17
0
12 Jan 2021
Robustness of on-device Models: Adversarial Attack to Deep Learning Models on Android Apps
Yujin Huang
Han Hu
Chunyang Chen
AAML FedML
115
33
0
12 Jan 2021
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity
W. Fedus
Barret Zoph
Noam M. Shazeer
MoE
172
2,248
0
11 Jan 2021
A Heuristic-driven Ensemble Framework for COVID-19 Fake News Detection
Sourya Dipta Das
Ayan Basak
S. Dutta
86
39
0
10 Jan 2021
Cisco at AAAI-CAD21 shared task: Predicting Emphasis in Presentation Slides using Contextualized Embeddings
Sreyan Ghosh
Sonal Kumar
H. Jalan
Hemant Yadav
R. Shah
84
2
0
10 Jan 2021
BERT & Family Eat Word Salad: Experiments with Text Understanding
Ashim Gupta
Giorgi Kvernadze
Vivek Srikumar
260
73
0
10 Jan 2021
LightXML: Transformer with Dynamic Negative Sampling for High-Performance Extreme Multi-label Text Classification
Ting Jiang
Deqing Wang
Leilei Sun
Huayi Yang
Zhengyang Zhao
Fuzhen Zhuang
VLM
208
140
0
09 Jan 2021
Simplified DOM Trees for Transferable Attribute Extraction from the Web
Yichao Zhou
Ying Sheng
N. Vo
Nick Edmonds
Sandeep Tata
180
29
0
07 Jan 2021
Applying Transfer Learning for Improving Domain-Specific Search Experience Using Query to Question Similarity
Ankush Chopra
S. Agrawal
Sohom Ghosh
RALM
37
4
0
07 Jan 2021
TextBox: A Unified, Modularized, and Extensible Framework for Text Generation
Junyi Li
Tianyi Tang
Gaole He
Jinhao Jiang
Xiaoxuan Hu
Puzhao Xie
Zhipeng Chen
Zhuohao Yu
Wayne Xin Zhao
Ji-Rong Wen
135
25
0
06 Jan 2021
I-BERT: Integer-only BERT Quantization
Sehoon Kim
A. Gholami
Z. Yao
Michael W. Mahoney
Kurt Keutzer
MQ
181
354
0
05 Jan 2021
Retrieving and Reading: A Comprehensive Survey on Open-domain Question Answering
Fengbin Zhu
Wenqiang Lei
Chao Wang
Jianming Zheng
Soujanya Poria
Tat-Seng Chua
RALM
284
257
0
04 Jan 2021
Benchmarking Knowledge-Enhanced Commonsense Question Answering via Knowledge-to-Text Transformation
Ning Bian
Xianpei Han
Bo Chen
Le Sun
ELM
49
43
0
04 Jan 2021
Recoding latent sentence representations -- Dynamic gradient-based activation modification in RNNs
Dennis Ulmer
53
0
0
03 Jan 2021
Few-Shot Question Answering by Pretraining Span Selection
Ori Ram
Yuval Kirstain
Jonathan Berant
Amir Globerson
Omer Levy
104
98
0
02 Jan 2021
Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting
Wangchunshu Zhou
Tao Ge
Canwen Xu
Ke Xu
Furu Wei
LRM
83
16
0
02 Jan 2021
CDLM: Cross-Document Language Modeling
Avi Caciularu
Arman Cohan
Iz Beltagy
Matthew E. Peters
Arie Cattan
Ido Dagan
VLM
75
33
0
02 Jan 2021
Superbizarre Is Not Superb: Derivational Morphology Improves BERT's Interpretation of Complex Words
Valentin Hofmann
J. Pierrehumbert
Hinrich Schütze
129
72
0
02 Jan 2021
Lex-BERT: Enhancing BERT based NER with lexicons
Wei-wei Zhu
Daniel Cheung
73
8
0
02 Jan 2021
End-to-end Semantic Role Labeling with Neural Transition-based Model
Hao Fei
Meishan Zhang
Bobo Li
Donghong Ji
OffRL
59
37
0
02 Jan 2021
Learning to Emphasize: Dataset and Shared Task Models for Selecting Emphasis in Presentation Slides
Amirreza Shirani
Gia-Lac Tran
Hieu Trinh
Franck Dernoncourt
Nedim Lipka
P. Asente
J. Echevarria
Thamar Solorio
306
1
0
02 Jan 2021
A Robust and Domain-Adaptive Approach for Low-Resource Named Entity Recognition
Houjin Yu
Xian-Ling Mao
Zewen Chi
Wei Wei
Heyan Huang
217
12
0
02 Jan 2021
VisualSparta: An Embarrassingly Simple Approach to Large-scale Text-to-Image Search with Weighted Bag-of-words
Xiaopeng Lu
Tiancheng Zhao
Kyusong Lee
71
27
0
01 Jan 2021
BanglaBERT: Language Model Pretraining and Benchmarks for Low-Resource Language Understanding Evaluation in Bangla
Abhik Bhattacharjee
Tahmid Hasan
Wasi Uddin Ahmad
Kazi Samin Mubasshir
Md. Saiful Islam
Anindya Iqbal
M. Rahman
Rifat Shahriyar
SSL VLM
101
180
0
01 Jan 2021
On Explaining Your Explanations of BERT: An Empirical Study with Sequence Classification
Zhengxuan Wu
Desmond C. Ong
78
22
0
01 Jan 2021
Transformer based Automatic COVID-19 Fake News Detection System
Sunil Gundapu
R. Mamidi
91
72
0
01 Jan 2021
How Do Your Biomedical Named Entity Recognition Models Generalize to Novel Entities?
Hyunjae Kim
Jaewoo Kang
AI4CE
161
21
0
01 Jan 2021
EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets
Xiaohan Chen
Yu Cheng
Shuohang Wang
Zhe Gan
Zhangyang Wang
Jingjing Liu
131
100
0
31 Dec 2020
Rethinking Semantic Segmentation from a Sequence-to-Sequence Perspective with Transformers
Sixiao Zheng
Jiachen Lu
Hengshuang Zhao
Xiatian Zhu
Zekun Luo
...
Yanwei Fu
Jianfeng Feng
Tao Xiang
Philip Torr
Li Zhang
ViT
206
2,928
0
31 Dec 2020
MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers
Wenhui Wang
Hangbo Bao
Shaohan Huang
Li Dong
Furu Wei
MQ
139
276
0
31 Dec 2020
Understanding Politics via Contextualized Discourse Processing
Rajkumar Pujari
Dan Goldwasser
57
20
0
31 Dec 2020
ERNIE-Doc: A Retrospective Long-Document Modeling Transformer
Siyu Ding
Junyuan Shang
Shuohuan Wang
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang
116
55
0
31 Dec 2020