XLNet: Generalized Autoregressive Pretraining for Language Understanding

19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
    AI4CE

Papers citing "XLNet: Generalized Autoregressive Pretraining for Language Understanding"

50 / 3,520 papers shown
RevUp: Revise and Update Information Bottleneck for Event Representation
Mehdi Rezaee
Francis Ferraro
88
1
0
24 May 2022
RetroMAE: Pre-Training Retrieval-oriented Language Models Via Masked Auto-Encoder
Shitao Xiao
Zheng Liu
Yingxia Shao
Bo Zhao
RALM
280
126
0
24 May 2022
Analysing the Greek Parliament Records with Emotion Classification
John Pavlopoulos
Vanessa Lislevand
52
2
0
24 May 2022
Community Question Answering Entity Linking via Leveraging Auxiliary Data
Yuhan Li
Wei Shen
Jianbo Gao
Yadong Wang
89
11
0
24 May 2022
Deep Learning Meets Software Engineering: A Survey on Pre-Trained Models of Source Code
Changan Niu
Chuanyi Li
Bin Luo
Vincent Ng
SyDa
VLM
107
50
0
24 May 2022
On the Role of Bidirectionality in Language Model Pre-Training
Mikel Artetxe
Jingfei Du
Naman Goyal
Luke Zettlemoyer
Ves Stoyanov
202
17
0
24 May 2022
FlexiBERT: Are Current Transformer Architectures too Homogeneous and Rigid?
Shikhar Tuli
Bhishma Dedhia
Shreshth Tuli
N. Jha
94
14
0
23 May 2022
Prompt-and-Rerank: A Method for Zero-Shot and Few-Shot Arbitrary Textual Style Transfer with Small Language Models
Mirac Suzgun
Luke Melas-Kyriazi
Dan Jurafsky
VLM
129
67
0
23 May 2022
A Question-Answer Driven Approach to Reveal Affirmative Interpretations from Verbal Negations
Md Mosharaf Hossain
L. Holman
Anusha Kakileti
T. Kao
N. Brito
A. Mathews
Eduardo Blanco
51
4
0
23 May 2022
Prompt Tuning for Discriminative Pre-trained Language Models
Yuan Yao
Bowen Dong
Ao Zhang
Zhengyan Zhang
Ruobing Xie
Zhiyuan Liu
Leyu Lin
Maosong Sun
Jianyong Wang
VLM
80
34
0
23 May 2022
Computational Storytelling and Emotions: A Survey
Yusuke Mori
Hiroaki Yamane
Yusuke Mukuta
Tatsuya Harada
88
2
0
23 May 2022
Sequence-to-Action: Grammatical Error Correction with Action Guided Sequence Generation
Jiquan Li
Junliang Guo
Yongxin Zhu
Xin Sheng
Deqiang Jiang
Bo Ren
Linli Xu
110
24
0
22 May 2022
What Do Compressed Multilingual Machine Translation Models Forget?
Alireza Mohammadshahi
Vassilina Nikoulina
Alexandre Berard
Caroline Brun
James Henderson
Laurent Besacier
AI4CE
100
10
0
22 May 2022
GraphMAE: Self-Supervised Masked Graph Autoencoders
Zhenyu Hou
Xiao Liu
Yukuo Cen
Yuxiao Dong
Hongxia Yang
C. Wang
Jie Tang
SSL
147
593
0
22 May 2022
Self-Supervised Speech Representation Learning: A Review
Abdel-rahman Mohamed
Hung-yi Lee
Lasse Borgholt
Jakob Drachmann Havtorn
Joakim Edin
...
Shang-Wen Li
Karen Livescu
Lars Maaløe
Tara N. Sainath
Shinji Watanabe
SSL
AI4TS
287
368
0
21 May 2022
DeepStruct: Pretraining of Language Models for Structure Prediction
Chenguang Wang
Xiao Liu
Zui Chen
Haoyun Hong
Jie Tang
Dawn Song
278
71
0
21 May 2022
Pre-training Transformer Models with Sentence-Level Objectives for Answer Sentence Selection
Luca Di Liello
Siddhant Garg
Luca Soldaini
Alessandro Moschitti
71
17
0
20 May 2022
Forecasting COVID-19 Caseloads Using Unsupervised Embedding Clusters of Social Media Posts
Felix Drinkall
S. Zohren
J. Pierrehumbert
45
5
0
20 May 2022
Heterformer: Transformer-based Deep Node Representation Learning on Heterogeneous Text-Rich Networks
Bowen Jin
Yu Zhang
Qi Zhu
Jiawei Han
138
41
0
20 May 2022
Progressive Class Semantic Matching for Semi-supervised Text Classification
Hai-Ming Xu
Lingqiao Liu
Ehsan Abbasnejad
VLM
68
11
0
20 May 2022
Prototypical Calibration for Few-shot Learning of Language Models
Zhixiong Han
Y. Hao
Li Dong
Yutao Sun
Furu Wei
266
56
0
20 May 2022
Visually-Augmented Language Modeling
Weizhi Wang
Li Dong
Hao Cheng
Haoyu Song
Xiaodong Liu
Xifeng Yan
Jianfeng Gao
Furu Wei
VLM
89
18
0
20 May 2022
MSTRIQ: No Reference Image Quality Assessment Based on Swin Transformer with Multi-Stage Fusion
Jing Wang
Haotian Fa
X. Hou
Yitian Xu
Tao Li
X. Lu
Lean Fu
80
21
0
20 May 2022
How to keep text private? A systematic review of deep learning methods for privacy-preserving natural language processing
Samuel Sousa
Roman Kern
PILM
AILaw
79
46
0
20 May 2022
KERPLE: Kernelized Relative Positional Embedding for Length Extrapolation
Ta-Chung Chi
Ting-Han Fan
Peter J. Ramadge
Alexander I. Rudnicky
100
73
0
20 May 2022
Transformer with Memory Replay
R. Liu
Barzan Mozafari
OffRL
105
4
0
19 May 2022
VNT-Net: Rotational Invariant Vector Neuron Transformers
Hedi Zisling
Andrei Sharf
3DPC
53
1
0
19 May 2022
Transformers as Neural Augmentors: Class Conditional Sentence Generation via Variational Bayes
M. Bilici
M. Amasyalı
ViT
61
2
0
19 May 2022
TransTab: Learning Transferable Tabular Transformers Across Tables
Zifeng Wang
Jimeng Sun
LMTD
85
151
0
19 May 2022
PromptDA: Label-guided Data Augmentation for Prompt-based Few-shot Learners
Canyu Chen
Kai Shu
VLM
104
8
0
18 May 2022
Trading Positional Complexity vs. Deepness in Coordinate Networks
Jianqiao Zheng
Sameera Ramasinghe
Xueqian Li
Simon Lucey
101
19
0
18 May 2022
Exploiting Social Media Content for Self-Supervised Style Transfer
Dana Ruiter
Thomas Kleinbauer
C. España-Bonet
Josef van Genabith
Dietrich Klakow
85
2
0
18 May 2022
PASH at TREC 2021 Deep Learning Track: Generative Enhanced Model for Multi-stage Ranking
Yixuan Qiao
Hao Chen
Jun Wang
Yongquan Lai
Tuozhen Liu
...
Xin Tang
Rui Fang
Peng Gao
Wenfeng Xie
Guotong Xie
51
1
0
18 May 2022
MulT: An End-to-End Multitask Learning Transformer
Deblina Bhattacharjee
Tong Zhang
Sabine Süsstrunk
Mathieu Salzmann
ViT
116
68
0
17 May 2022
Dimensionality Reduced Training by Pruning and Freezing Parts of a Deep Neural Network, a Survey
Paul Wimmer
Jens Mehnert
Alexandru Paul Condurache
DD
98
21
0
17 May 2022
A Fast Attention Network for Joint Intent Detection and Slot Filling on Edge Devices
Liang Huang
Senjie Liang
Feiyang Ye
Nan Gao
93
4
0
16 May 2022
TiBERT: Tibetan Pre-trained Language Model
Yuan Sun
Sisi Liu
Junjie Deng
Xiaobing Zhao
94
10
0
15 May 2022
Discovering Latent Concepts Learned in BERT
Fahim Dalvi
A. Khan
Firoj Alam
Nadir Durrani
Jia Xu
Hassan Sajjad
SSL
50
61
0
15 May 2022
Fair Bayes-Optimal Classifiers Under Predictive Parity
Xianli Zeng
Yan Sun
Guang Cheng
FaML
109
14
0
15 May 2022
Hero-Gang Neural Model For Named Entity Recognition
Jinpeng Hu
Yaling Shen
Yang Liu
Xiang Wan
Tsung-Hui Chang
62
15
0
15 May 2022
From Cognitive to Computational Modeling: Text-based Risky Decision-Making Guided by Fuzzy Trace Theory
Jaron Mar
Jiamou Liu
76
2
0
15 May 2022
Improving Contextual Representation with Gloss Regularized Pre-training
Yu Lin
Zhecheng An
Peihao Wu
Zejun Ma
83
5
0
13 May 2022
The Mechanism of Prediction Head in Non-contrastive Self-supervised Learning
Zixin Wen
Yuanzhi Li
SSL
116
35
0
12 May 2022
Predicting Human Psychometric Properties Using Computational Language Models
Antonio Laverghetta
Animesh Nighojkar
Jamshidbek Mirzakhalov
John Licato
62
9
0
12 May 2022
e-CARE: a New Dataset for Exploring Explainable Causal Reasoning
Li Du
Xiao Ding
Kai Xiong
Ting Liu
Bing Qin
CML
82
67
0
12 May 2022
DISARM: Detecting the Victims Targeted by Harmful Memes
Shivam Sharma
Md. Shad Akhtar
Preslav Nakov
Tanmoy Chakraborty
71
32
0
11 May 2022
Towards Unified Prompt Tuning for Few-shot Text Classification
Jiadong Wang
Chengyu Wang
Fuli Luo
Chuanqi Tan
Minghui Qiu
Fei Yang
Qiuhui Shi
Songfang Huang
Ming Gao
VLM
70
28
0
11 May 2022
Massively Digitized Power Grid: Opportunities and Challenges of Use-inspired AI
Le Xie
Xiangtian Zheng
Yannan Sun
Tong Huang
Tony Bruton
AI4CE
60
19
0
10 May 2022
Sibylvariant Transformations for Robust Text Classification
Fabrice Harel-Canada
Muhammad Ali Gulzar
Nanyun Peng
Miryung Kim
AAML
VLM
78
4
0
10 May 2022
UL2: Unifying Language Learning Paradigms
Yi Tay
Mostafa Dehghani
Vinh Q. Tran
Xavier Garcia
Jason W. Wei
...
Tal Schuster
H. Zheng
Denny Zhou
N. Houlsby
Donald Metzler
AI4CE
141
313
0
10 May 2022