XLNet: Generalized Autoregressive Pretraining for Language Understanding

19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
    AI4CE

Papers citing "XLNet: Generalized Autoregressive Pretraining for Language Understanding"

50 / 3,520 papers shown
WADER at SemEval-2023 Task 9: A Weak-labelling framework for Data augmentation in tExt Regression Tasks
Manan Suri
Aaryak Garg
Divya Chaudhary
I. Gorton
B. Kumar
49
1
0
05 Mar 2023
TrojText: Test-time Invisible Textual Trojan Insertion
Qiang Lou
Ye Liu
Bo Feng
137
27
0
03 Mar 2023
Sparse MoE as the New Dropout: Scaling Dense and Self-Slimmable Transformers
Tianlong Chen
Zhenyu Zhang
Ajay Jaiswal
Shiwei Liu
Zhangyang Wang
MoE
116
50
0
02 Mar 2023
Can BERT Refrain from Forgetting on Sequential Tasks? A Probing Study
Mingxu Tao
Yansong Feng
Dongyan Zhao
CLL, KELM
72
10
0
02 Mar 2023
How Robust is GPT-3.5 to Predecessors? A Comprehensive Study on Language Understanding Tasks
Xuanting Chen
Junjie Ye
Can Zu
Nuo Xu
Rui Zheng
Minlong Peng
Jie Zhou
Tao Gui
Qi Zhang
Xuanjing Huang
AI4MH, ELM
69
83
0
01 Mar 2023
H-AES: Towards Automated Essay Scoring for Hindi
Shubhankar K. Singh
Anirudh Pupneja
Shivaansh Mital
Cheril Shah
Manish Bawkar
Lakshman Prasad Gupta
Ajit Kumar
Yaman Kumar Singla
Rushali Gupta
R. Shah
73
7
0
28 Feb 2023
Sampled Transformer for Point Sets
Shidi Li
Christian J. Walder
Alexander Soen
Lexing Xie
Miaomiao Liu
3DPC
72
1
0
28 Feb 2023
HugNLP: A Unified and Comprehensive Library for Natural Language Processing
Jiadong Wang
Nuo Chen
Qiushi Sun
Wenkang Huang
Chengyu Wang
Ming Gao
71
4
0
28 Feb 2023
Full Stack Optimization of Transformer Inference: a Survey
Sehoon Kim
Coleman Hooper
Thanakul Wattanawong
Minwoo Kang
Ruohan Yan
...
Qijing Huang
Kurt Keutzer
Michael W. Mahoney
Y. Shao
A. Gholami
MQ
163
106
0
27 Feb 2023
Systematic Rectification of Language Models via Dead-end Analysis
Mengyao Cao
Mehdi Fatemi
Jackie C.K. Cheung
Samira Shabanian
KELM
73
16
0
27 Feb 2023
Hulk: Graph Neural Networks for Optimizing Regionally Distributed Computing Systems
Zheng Yuan
HU Xue
Chaoyun Zhang
Yongming Liu
GNN, AI4CE
39
1
0
27 Feb 2023
Elementwise Language Representation
Du-Yeong Kim
Jeeeun Kim
67
0
0
27 Feb 2023
HULAT at SemEval-2023 Task 10: Data augmentation for pre-trained transformers applied to the detection of sexism in social media
Isabel Segura-Bedmar
ViT
39
2
0
24 Feb 2023
Hiding Data Helps: On the Benefits of Masking for Sparse Coding
Muthuraman Chidambaram
Chenwei Wu
Yu Cheng
Rong Ge
89
0
0
24 Feb 2023
Edgeformers: Graph-Empowered Transformers for Representation Learning on Textual-Edge Networks
Bowen Jin
Yu Zhang
Yu Meng
Jiawei Han
97
31
0
21 Feb 2023
Exploring the Limits of Transfer Learning with Unified Model in the Cybersecurity Domain
Kuntal Kumar Pal
Kazuaki Kashihara
Ujjwala Anantheswaran
Kirby Kuznia
S. Jagtap
Chitta Baral
AAML
34
3
0
20 Feb 2023
VoxSRC 2022: The Fourth VoxCeleb Speaker Recognition Challenge
Jaesung Huh
A. Brown
Jee-weon Jung
Joon Son Chung
Arsha Nagrani
D. Garcia-Romero
Andrew Zisserman
106
26
0
20 Feb 2023
Large-scale Multi-Modal Pre-trained Models: A Comprehensive Survey
Tianlin Li
Guangyao Chen
Guangwu Qian
Pengcheng Gao
Xiaoyong Wei
Yaowei Wang
Yonghong Tian
Wen Gao
AI4CE, VLM
157
216
0
20 Feb 2023
Upvotes? Downvotes? No Votes? Understanding the relationship between reaction mechanisms and political discourse on Reddit
Orestis Papakyriakopoulos
Severin Engelmann
Amy A. Winecoff
55
19
0
19 Feb 2023
Learning Language Representations with Logical Inductive Bias
Jianshu Chen
NAI, AI4CE, LRM
53
3
0
19 Feb 2023
Multimodal Propaganda Processing
Vincent Ng
Shengjie Li
107
2
0
17 Feb 2023
Marich: A Query-efficient Distributionally Equivalent Model Extraction Attack using Public Data
Pratik Karmakar
D. Basu
MIACV
89
7
0
16 Feb 2023
Cluster-based Deep Ensemble Learning for Emotion Classification in Internet Memes
Xiaoyu Guo
Jing Ma
A. Zubiaga
69
0
0
16 Feb 2023
Platform-Independent and Curriculum-Oriented Intelligent Assistant for Higher Education
Ramteja Sajja
Y. Sermet
David M. Cwiertny
Ibrahim Demir
60
68
0
15 Feb 2023
AbLit: A Resource for Analyzing and Generating Abridged Versions of English Literature
Melissa Roemmele
Kyle Shaffer
Katrina Olsen
Yiyi Wang
Steve DeNeefe
47
1
0
13 Feb 2023
An Extended Sequence Tagging Vocabulary for Grammatical Error Correction
Stuart Mesham
Christopher Bryant
Marek Rei
Zheng Yuan
76
8
0
12 Feb 2023
TextDefense: Adversarial Text Detection based on Word Importance Entropy
Lujia Shen
Xuhong Zhang
S. Ji
Yuwen Pu
Chunpeng Ge
Xing Yang
Yanghe Feng
AAML
59
8
0
12 Feb 2023
Transformer models: an introduction and catalog
X. Amatriain
Ananth Sankar
Jie Bing
Praveen Kumar Bodigutla
Timothy J. Hazen
Michaeel Kazi
133
53
0
12 Feb 2023
A Reparameterized Discrete Diffusion Model for Text Generation
Lin Zheng
Jianbo Yuan
Lei Yu
Lingpeng Kong
DiffM
151
70
0
11 Feb 2023
In-Context Learning with Many Demonstration Examples
Mukai Li
Shansan Gong
Jiangtao Feng
Yiheng Xu
Jinchao Zhang
Zhiyong Wu
Lingpeng Kong
111
38
0
09 Feb 2023
Zero-Shot Learning for Requirements Classification: An Exploratory Study
Waad Alhoshan
Alessio Ferrari
Liping Zhao
VLM
113
41
0
09 Feb 2023
A Large-Scale Analysis of Persian Tweets Regarding Covid-19 Vaccination
Taha ShabaniMirzaei
Houmaan Chamani
Amirhossein Abaskohi
Zhivar Sourati Hassan Zadeh
B. Bahrak
27
1
0
09 Feb 2023
Training-free Lexical Backdoor Attacks on Language Models
Yujin Huang
Terry Yue Zhuo
Xingliang Yuan
Han Hu
Lizhen Qu
Chunyang Chen
SILM
97
46
0
08 Feb 2023
Revisiting Offline Compression: Going Beyond Factorization-based Methods for Transformer Language Models
Mohammadreza Banaei
Klaudia Bałazy
Artur Kasymov
R. Lebret
Jacek Tabor
Karl Aberer
OffRL
48
0
0
08 Feb 2023
EvoText: Enhancing Natural Language Generation Models via Self-Escalation Learning for Up-to-Date Knowledge and Improved Performance
Zheng Yuan
HU Xue
Chuxu Zhang
Yongming Liu
VLM
66
0
0
08 Feb 2023
The Effect of Metadata on Scientific Literature Tagging: A Cross-Field Cross-Model Study
Yu Zhang
Bowen Jin
Qi Zhu
Yu Meng
Jiawei Han
92
20
0
07 Feb 2023
Data Selection for Language Models via Importance Resampling
Sang Michael Xie
Shibani Santurkar
Tengyu Ma
Percy Liang
131
196
0
06 Feb 2023
Findings of the TSAR-2022 Shared Task on Multilingual Lexical Simplification
Horacio Saggion
S. Štajner
Daniel Ferrés
Kim Cheng Sheang
Matthew Shardlow
Kai North
Marcos Zampieri
58
51
0
06 Feb 2023
Computation vs. Communication Scaling for Future Transformers on Future Hardware
Suchita Pati
Shaizeen Aga
Mahzabeen Islam
Nuwan Jayasena
Matthew D. Sinclair
51
10
0
06 Feb 2023
Rethinking Out-of-distribution (OOD) Detection: Masked Image Modeling is All You Need
Jingyao Li
Pengguang Chen
Shaozuo Yu
Zexin He
Shu Liu
Jiaya Jia
OODD
102
46
0
06 Feb 2023
KDEformer: Accelerating Transformers via Kernel Density Estimation
A. Zandieh
Insu Han
Majid Daliri
Amin Karbasi
125
47
0
05 Feb 2023
Knowledge Distillation in Vision Transformers: A Critical Review
Gousia Habib
Tausifa Jan Saleem
Brejesh Lall
98
16
0
04 Feb 2023
Representation Deficiency in Masked Language Modeling
Yu Meng
Jitin Krishnan
Sinong Wang
Qifan Wang
Yuning Mao
Han Fang
Marjan Ghazvininejad
Jiawei Han
Luke Zettlemoyer
149
7
0
04 Feb 2023
A Case Study for Compliance as Code with Graphs and Language Models: Public release of the Regulatory Knowledge Graph
V. Ershov
22
5
0
03 Feb 2023
Bioformer: an efficient transformer language model for biomedical text mining
Li Fang
Qingyu Chen
Chih-Hsuan Wei
Zhiyong Lu
Kai Wang
MedIm, AI4CE
65
22
0
03 Feb 2023
Detecting Reddit Users with Depression Using a Hybrid Neural Network SBERT-CNN
Ziyi Chen
Ren Yang
S. Fu
Nansu Zong
Hongfang Liu
Ming Huang
AI4MH
41
14
0
03 Feb 2023
A Survey of Deep Learning: From Activations to Transformers
Johannes Schneider
Michalis Vlachos
ViT, MedIm, AI4TS, AI4CE
112
10
0
01 Feb 2023
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video
Haiyang Xu
Qinghao Ye
Mingshi Yan
Yaya Shi
Jiabo Ye
...
Guohai Xu
Ji Zhang
Songfang Huang
Feiran Huang
Jingren Zhou
MLLM, VLM, MoE
116
171
0
01 Feb 2023
Protein Representation Learning via Knowledge Enhanced Primary Structure Modeling
Hong-Yu Zhou
Yunxiang Fu
Zhicheng Zhang
Cheng Bian
Yizhou Yu
90
8
0
30 Jan 2023
Response-act Guided Reinforced Dialogue Generation for Mental Health Counseling
Aseem Srivastava
Ishan Pandey
Md. Shad Akhtar
Tanmoy Chakraborty
OffRL
75
13
0
30 Jan 2023