ResearchTrend.AI
XLNet: Generalized Autoregressive Pretraining for Language Understanding

19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
    AI4CE

Papers citing "XLNet: Generalized Autoregressive Pretraining for Language Understanding"

50 / 3,520 papers shown
Title
Ensembling and Knowledge Distilling of Large Sequence Taggers for Grammatical Error Correction
M. Tarnavskyi
Artem Chernodub
Kostiantyn Omelianchuk
3DV
59
26
0
24 Mar 2022
ERNIE-SPARSE: Learning Hierarchical Efficient Transformer Through Regularized Self-Attention
Yang Liu
Jiaxiang Liu
L. Chen
Yuxiang Lu
Shi Feng
Zhida Feng
Yu Sun
Hao Tian
Huancheng Wu
Hai-feng Wang
70
9
0
23 Mar 2022
Towards Expressive Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis
Shunwei Lei
Yixuan Zhou
Liyang Chen
Zhiyong Wu
Shiyin Kang
Helen Meng
52
12
0
23 Mar 2022
Transformer based ensemble for emotion detection
Aditya Kane
Shantanu Patankar
Sahil Khose
Neeraja Kirtane
73
10
0
22 Mar 2022
Factual Consistency of Multilingual Pretrained Language Models
Constanza Fierro
Anders Søgaard
HILM
68
16
0
22 Mar 2022
Task-guided Disentangled Tuning for Pretrained Language Models
Jiali Zeng
Yu Jiang
Shuangzhi Wu
Yongjing Yin
Mu Li
DRL
150
3
0
22 Mar 2022
Suum Cuique: Studying Bias in Taboo Detection with a Community Perspective
Osama Khalid
Jonathan Rusert
P. Srinivasan
27
1
0
22 Mar 2022
Masked Discrimination for Self-Supervised Learning on Point Clouds
Haotian Liu
Mu Cai
Yong Jae Lee
3DPC
126
172
0
21 Mar 2022
XTREME-S: Evaluating Cross-lingual Speech Representations
Alexis Conneau
Ankur Bapna
Yu Zhang
Min Ma
Patrick von Platen
...
Orhan Firat
Michael Auli
Sebastian Ruder
Jason Riesa
Melvin Johnson
VLM, AILaw, ELM
155
22
0
21 Mar 2022
Cluster & Tune: Boost Cold Start Performance in Text Classification
Eyal Shnarch
Ariel Gera
Alon Halfon
Lena Dankin
Leshem Choshen
R. Aharonov
Noam Slonim
67
22
0
20 Mar 2022
How does the pre-training objective affect what large language models learn about linguistic properties?
Ahmed Alajrami
Nikolaos Aletras
82
20
0
20 Mar 2022
On Robust Prefix-Tuning for Text Classification
Zonghan Yang
Yang Liu
VLM
68
21
0
19 Mar 2022
Pretraining with Artificial Language: Studying Transferable Knowledge in Language Models
Ryokan Ri
Yoshimasa Tsuruoka
89
28
0
19 Mar 2022
DP-KB: Data Programming with Knowledge Bases Improves Transformer Fine Tuning for Answer Sentence Selection
Nic Jedema
Thuy Vu
Manish Gupta
Alessandro Moschitti
50
1
0
17 Mar 2022
Leveraging Adversarial Examples to Quantify Membership Information Leakage
Ganesh Del Grosso
Hamid Jalalzai
Georg Pichler
C. Palamidessi
Pablo Piantanida
MIACV
77
23
0
17 Mar 2022
AutoSDF: Shape Priors for 3D Completion, Reconstruction and Generation
Paritosh Mittal
Y. Cheng
Maneesh Singh
Shubham Tulsiani
130
230
0
17 Mar 2022
elBERto: Self-supervised Commonsense Learning for Question Answering
Xunlin Zhan
Yuan Li
Xiao Dong
Xiaodan Liang
Zhiting Hu
Lawrence Carin
SSL, RALM, LRM
77
8
0
17 Mar 2022
EVA2.0: Investigating Open-Domain Chinese Dialogue Systems with Large-Scale Pre-Training
Yuxian Gu
Jiaxin Wen
Hao Sun
Yi Song
Pei Ke
...
Zheng Zhang
Jianzhu Yao
Lei Liu
Xiaoyan Zhu
Minlie Huang
93
55
0
17 Mar 2022
Finding Structural Knowledge in Multimodal-BERT
Victor Milewski
Miryam de Lhoneux
Marie-Francine Moens
72
10
0
17 Mar 2022
PreTR: Spatio-Temporal Non-Autoregressive Trajectory Prediction Transformer
Lina Achaji
Thierno Barry
Thibault Fouqueray
Julien Moreau
François Aioun
François Charpillet
92
16
0
17 Mar 2022
Confidence Calibration for Intent Detection via Hyperspherical Space and Rebalanced Accuracy-Uncertainty Loss
Yantao Gong
Cao Liu
Fan Yang
Xunliang Cai
Guanglu Wan
Jiansong Chen
Weipeng Zhang
Houfeng Wang
UQCV
60
2
0
17 Mar 2022
Type-Driven Multi-Turn Corrections for Grammatical Error Correction
Shaopeng Lai
Qingyu Zhou
Jiali Zeng
Zhongli Li
Chao Li
Yunbo Cao
Jinsong Su
KELM
46
15
0
17 Mar 2022
Are Vision Transformers Robust to Spurious Correlations?
Soumya Suvra Ghosal
Yifei Ming
Yixuan Li
ViT
71
35
0
17 Mar 2022
UNIMO-2: End-to-End Unified Vision-Language Grounded Learning
Wei Li
Can Gao
Guocheng Niu
Xinyan Xiao
Hao Liu
Jiachen Liu
Hua Wu
Haifeng Wang
MLLM
51
22
0
17 Mar 2022
Modular and Parameter-Efficient Multimodal Fusion with Prompting
Sheng Liang
Mengjie Zhao
Hinrich Schütze
93
45
0
15 Mar 2022
Imputing Out-of-Vocabulary Embeddings with LOVE Makes Language Models Robust with Little Cost
Lihu Chen
Gaël Varoquaux
Fabian M. Suchanek
77
15
0
15 Mar 2022
Unsupervised Keyphrase Extraction via Interpretable Neural Networks
Rishabh Joshi
Vidhisha Balachandran
Emily Saldanha
M. Glenski
Svitlana Volkova
Yulia Tsvetkov
SSL
85
1
0
15 Mar 2022
VAST: The Valence-Assessing Semantics Test for Contextualizing Language Models
Robert Wolfe
Aylin Caliskan
62
13
0
14 Mar 2022
Switch Trajectory Transformer with Distributional Value Approximation for Multi-Task Reinforcement Learning
Qinjie Lin
Han Liu
B. Sengupta
OffRL
72
12
0
14 Mar 2022
A Novel Perspective to Look At Attention: Bi-level Attention-based Explainable Topic Modeling for News Classification
Dairui Liu
Derek Greene
Ruihai Dong
62
12
0
14 Mar 2022
WCL-BBCD: A Contrastive Learning and Knowledge Graph Approach to Named Entity Recognition
Renjie Zhou
Qian Hu
Jian Wan
Jilin Zhang
Qiang Liu
Tianxiang Hu
Jian Li
59
3
0
14 Mar 2022
PERT: Pre-training BERT with Permuted Language Model
Yiming Cui
Ziqing Yang
Ting Liu
85
37
0
14 Mar 2022
SUPERB-SG: Enhanced Speech processing Universal PERformance Benchmark for Semantic and Generative Capabilities
Hsiang-Sheng Tsai
Heng-Jui Chang
Wen-Chin Huang
Zili Huang
Kushal Lakhotia
...
Hsuan-Jui Chen
Shang-Wen Li
Shinji Watanabe
Abdel-rahman Mohamed
Hung-yi Lee
91
110
0
14 Mar 2022
Can pre-trained Transformers be used in detecting complex sensitive sentences? -- A Monsanto case study
Roelien C. Timmer
David Liebowitz
Surya Nepal
S. Kanhere
50
8
0
14 Mar 2022
SciNLI: A Corpus for Natural Language Inference on Scientific Text
Mobashir Sadat
Cornelia Caragea
AILaw
89
37
0
13 Mar 2022
Information retrieval for label noise document ranking by bag sampling and group-wise loss
Chunyuan Li
Jiajia Ding
Xing Hu
Fan Wang
RALM
28
0
0
12 Mar 2022
Survey on Automated Short Answer Grading with Deep Learning: from Word Embeddings to Transformers
Stefan Haller
Adina Aldea
C. Seifert
N. Strisciuglio
67
39
0
11 Mar 2022
A comparative study of non-deep learning, deep learning, and ensemble learning methods for sunspot number prediction
Yuchen Dang
Ziqi Chen
Heng Li
Hai Shu
ELM, BDL
52
26
0
11 Mar 2022
PETR: Position Embedding Transformation for Multi-View 3D Object Detection
Yingfei Liu
Tiancai Wang
Xinming Zhang
Jian Sun
3DPC
148
554
0
10 Mar 2022
HealthPrompt: A Zero-shot Learning Paradigm for Clinical Natural Language Processing
Sonish Sivarajkumar
Yanshan Wang
VLM, LM&MA
103
58
0
09 Mar 2022
Towards Inadequately Pre-trained Models in Transfer Learning
Andong Deng
Xingjian Li
Di Hu
Tianyang Wang
Haoyi Xiong
Chengzhong Xu
25
6
0
09 Mar 2022
DARER: Dual-task Temporal Relational Recurrent Reasoning Network for Joint Dialog Sentiment Classification and Act Recognition
Bowen Xing
Ivor W. Tsang
39
19
0
08 Mar 2022
Multi-CPR: A Multi Domain Chinese Dataset for Passage Retrieval
Dingkun Long
Qiong Gao
Kuan-sheng Zou
Guangwei Xu
Pengjun Xie
Rui Guo
Jianfeng Xu
Guanjun Jiang
Luxi Xing
P. Yang
96
23
0
07 Mar 2022
ILDAE: Instance-Level Difficulty Analysis of Evaluation Data
Neeraj Varshney
Swaroop Mishra
Chitta Baral
69
19
0
07 Mar 2022
Exploring Optical-Flow-Guided Motion and Detection-Based Appearance for Temporal Sentence Grounding
Daizong Liu
Xiang Fang
Wei Hu
Pan Zhou
98
37
0
06 Mar 2022
IISERB Brains at SemEval 2022 Task 6: A Deep-learning Framework to Identify Intended Sarcasm in English
Tanuj Singh Shekhawat
M. Kumar
Udaybhan Rathore
Aditya Joshi
Jasabanta Patro
53
3
0
04 Mar 2022
Improving Health Mentioning Classification of Tweets using Contrastive Adversarial Training
Pervaiz Iqbal Khan
Shoaib Ahmed Siddiqui
Imran Razzak
Andreas Dengel
Sheraz Ahmed
44
4
0
03 Mar 2022
A Simple Hash-Based Early Exiting Approach For Language Understanding and Generation
Tianxiang Sun
Xiangyang Liu
Wei-wei Zhu
Zhichao Geng
Lingling Wu
Yilong He
Yuan Ni
Guotong Xie
Xuanjing Huang
Xipeng Qiu
90
41
0
03 Mar 2022
Providing Insights for Open-Response Surveys via End-to-End Context-Aware Clustering
S. Esmaeilzadeh
Brian Williams
Davood Shamsi
Onar Vikingstad
30
2
0
02 Mar 2022
Large-Scale Hate Speech Detection with Cross-Domain Transfer
Cagri Toraman
Furkan Şahinuç
E. Yilmaz
126
63
0
02 Mar 2022