XLNet: Generalized Autoregressive Pretraining for Language Understanding

19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
    AI4CE

Papers citing "XLNet: Generalized Autoregressive Pretraining for Language Understanding"

50 / 3,524 papers shown
Interactive Machine Comprehension with Dynamic Knowledge Graphs
Xingdi Yuan
93
3
0
31 Aug 2021
Thermostat: A Large Collection of NLP Model Explanations and Analysis Tools
Nils Feldhus
Robert Schwarzenberg
Sebastian Möller
123
14
0
31 Aug 2021
Backdoor Attacks on Pre-trained Models by Layerwise Weight Poisoning
Linyang Li
Demin Song
Xiaonan Li
Jiehang Zeng
Ruotian Ma
Xipeng Qiu
147
141
0
31 Aug 2021
Automated Mining of Leaderboards for Empirical AI Research
Salomon Kabongo KABENAMUALU
Jennifer D'Souza
Sören Auer
115
31
0
31 Aug 2021
Improving Multimodal fusion via Mutual Dependency Maximisation
Pierre Colombo
E. Chapuis
Matthieu Labeau
Chloé Clavel
182
31
0
31 Aug 2021
How Does Adversarial Fine-Tuning Benefit BERT?
J. Ebrahimi
Hao Yang
Wei Zhang
AAML
58
4
0
31 Aug 2021
N24News: A New Dataset for Multimodal News Classification
Zhen Wang
Xu Shan
Xiangxie Zhang
Jie Yang
VLM
108
38
0
30 Aug 2021
ASR-GLUE: A New Multi-task Benchmark for ASR-Robust Natural Language Understanding
Lingyun Feng
Jianwei Yu
Deng Cai
Songxiang Liu
Haitao Zheng
Yan Wang
ELM
186
14
0
30 Aug 2021
Evaluating Bayes Error Estimators on Real-World Datasets with FeeBee
Cédric Renggli
Luka Rimanic
Nora Hollenstein
Ce Zhang
34
12
0
30 Aug 2021
Shatter: An Efficient Transformer Encoder with Single-Headed Self-Attention and Relative Sequence Partitioning
Ran Tian
Joshua Maynez
Ankur P. Parikh
ViT
56
2
0
30 Aug 2021
Are Training Resources Insufficient? Predict First Then Explain!
Myeongjun Jang
Thomas Lukasiewicz
LRM
73
7
0
29 Aug 2021
NoiER: An Approach for Training more Reliable Fine-Tuned Downstream Task Models
Myeongjun Jang
Thomas Lukasiewicz
67
4
0
29 Aug 2021
Code-switched inspired losses for generic spoken dialog representations
E. Chapuis
Pierre Colombo
Matthieu Labeau
Chloé Clavel
177
12
0
27 Aug 2021
Evaluating the Robustness of Neural Language Models to Input Perturbations
M. Moradi
Matthias Samwald
AAML
101
102
0
27 Aug 2021
Exploring the Capacity of a Large-scale Masked Language Model to Recognize Grammatical Errors
Ryo Nagata
Manabu Kimura
Kazuaki Hanawa
32
5
0
27 Aug 2021
EmoBERTa: Speaker-Aware Emotion Recognition in Conversation with RoBERTa
Taewoon Kim
Piek Vossen
98
102
0
26 Aug 2021
Auxiliary Task Update Decomposition: The Good, The Bad and The Neutral
Lucio Dery
Yann N. Dauphin
David Grangier
MoMe
79
29
0
25 Aug 2021
SimVLM: Simple Visual Language Model Pretraining with Weak Supervision
Zirui Wang
Jiahui Yu
Adams Wei Yu
Zihang Dai
Yulia Tsvetkov
Yuan Cao
VLM
MLLM
183
801
0
24 Aug 2021
sigmoidF1: A Smooth F1 Score Surrogate Loss for Multilabel Classification
Gabriel Bénédict
Vincent Koops
Daan Odijk
Maarten de Rijke
100
33
0
24 Aug 2021
Explaining Bayesian Neural Networks
Kirill Bykov
Marina M.-C. Höhne
Adelaida Creosteanu
Klaus-Robert Müller
Frederick Klauschen
Shinichi Nakajima
Marius Kloft
BDL
AAML
72
25
0
23 Aug 2021
Regularizing Transformers With Deep Probabilistic Layers
Aurora Cobo Aguilera
Pablo M. Olmos
Antonio Artés-Rodríguez
Fernando Pérez-Cruz
70
9
0
23 Aug 2021
MM-ViT: Multi-Modal Video Transformer for Compressed Video Action Recognition
Jiawei Chen
C. Ho
ViT
101
78
0
20 Aug 2021
Knowledge Perceived Multi-modal Pretraining in E-commerce
Yushan Zhu
Huaixiao Tou
Wen Zhang
Ganqiang Ye
Hui Chen
Ningyu Zhang
Huajun Chen
94
33
0
20 Aug 2021
SMedBERT: A Knowledge-Enhanced Pre-trained Language Model with Structured Semantics for Medical Text Mining
Taolin Zhang
Zerui Cai
Chengyu Wang
Minghui Qiu
Bite Yang
Xiaofeng He
AI4MH
73
54
0
20 Aug 2021
Detection of Illicit Drug Trafficking Events on Instagram: A Deep Multimodal Multilabel Learning Approach
Chuanbo Hu
Minglei Yin
Bin Liu
Xin Li
Yanfang Ye
43
15
0
19 Aug 2021
DESYR: Definition and Syntactic Representation Based Claim Detection on the Web
Megha Sundriyal
Parantak Singh
Md. Shad Akhtar
Shubhashis Sengupta
Tanmoy Chakraborty
63
10
0
19 Aug 2021
QUEACO: Borrowing Treasures from Weakly-labeled Behavior Data for Query Attribute Value Extraction
Danqing Zhang
Zheng Li
Tianyu Cao
Chen Luo
Tony Wu
Hanqing Lu
Yiwei Song
Bing Yin
Tuo Zhao
Qiang Yang
79
20
0
19 Aug 2021
Modulating Language Models with Emotions
Ruibo Liu
Jason W. Wei
Chenyan Jia
Soroush Vosoughi
65
23
0
17 Aug 2021
MigrationsKB: A Knowledge Base of Public Attitudes towards Migrations and their Driving Factors
Yiyi Chen
Harald Sack
Mehwish Alam
36
3
0
17 Aug 2021
ROSITA: Enhancing Vision-and-Language Semantic Alignments via Cross- and Intra-modal Knowledge Integration
Yuhao Cui
Zhou Yu
Chunqi Wang
Zhongzhou Zhao
Ji Zhang
Meng Wang
Jun-chen Yu
VLM
77
56
0
16 Aug 2021
MUSIQ: Multi-scale Image Quality Transformer
Junjie Ke
Qifei Wang
Yilin Wang
P. Milanfar
Feng Yang
249
690
0
12 Aug 2021
Modeling Relevance Ranking under the Pre-training and Fine-tuning Paradigm
Lin Bo
Liang Pang
Gang Wang
Jun Xu
Xiuqiang He
Jirong Wen
48
4
0
12 Aug 2021
AMMUS : A Survey of Transformer-based Pretrained Models in Natural Language Processing
Katikapalli Subramanyam Kalyan
A. Rajasekharan
S. Sangeetha
VLM
LM&MA
113
270
0
12 Aug 2021
Unsupervised Corpus Aware Language Model Pre-training for Dense Passage Retrieval
Luyu Gao
Jamie Callan
RALM
301
342
0
12 Aug 2021
Variable-Length Music Score Infilling via XLNet and Musically Specialized Positional Encoding
Chin-Jui Chang
Chun-Yi Lee
Yi-Hsuan Yang
71
21
0
11 Aug 2021
A Transformer-based Math Language Model for Handwritten Math Expression Recognition
Quang Huy Ung
C. Nguyen
Hung Tuan Nguyen
Thanh-Nghia Truong
M. Nakagawa
19
9
0
11 Aug 2021
Differentiable Subset Pruning of Transformer Heads
Jiaoda Li
Ryan Cotterell
Mrinmaya Sachan
134
57
0
10 Aug 2021
Making Transformers Solve Compositional Tasks
Santiago Ontañón
Joshua Ainslie
Vaclav Cvicek
Zachary Kenneth Fisher
116
74
0
09 Aug 2021
Unifying Heterogeneous Electronic Health Records Systems via Text-Based Code Embedding
Kyunghoon Hur
Jiyoung Lee
Jungwoo Oh
Wesley Price
Young-Hak Kim
Edward Choi
103
19
0
08 Aug 2021
Language Model Evaluation in Open-ended Text Generation
An Nguyen
111
3
0
08 Aug 2021
LadRa-Net: Locally-Aware Dynamic Re-read Attention Net for Sentence Semantic Matching
Kun Zhang
Guangyi Lv
Le Wu
Enhong Chen
Qi Liu
Meng Wang
63
6
0
06 Aug 2021
Adaptive Residue-wise Profile Fusion for Low Homologous Protein Secondary Structure Prediction Using External Knowledge
Qin Wang
JunChao Wei
Boyuan Wang
Zhen Li
Sheng Wang
Shuguang Cui
51
1
0
05 Aug 2021
FMMformer: Efficient and Flexible Transformer via Decomposed Near-field and Far-field Attention
T. Nguyen
Vai Suliafu
Stanley J. Osher
Long Chen
Bao Wang
72
36
0
05 Aug 2021
Boosting Few-shot Semantic Segmentation with Transformers
Guolei Sun
Yun-Hai Liu
Christos Sakaridis
Luc Van Gool
ViT
63
9
0
04 Aug 2021
Vision Transformer with Progressive Sampling
Xiaoyu Yue
Shuyang Sun
Zhanghui Kuang
Meng Wei
Philip Torr
Wayne Zhang
Dahua Lin
ViT
91
85
0
03 Aug 2021
Exploiting BERT For Multimodal Target Sentiment Classification Through Input Space Translation
Zaid Khan
Y. Fu
81
140
0
03 Aug 2021
Musical Speech: A Transformer-based Composition Tool
Jason d'Eon
Sri Harsha Dumpala
Chandramouli Shama Sastry
Daniel Oore
Sageev Oore
64
1
0
02 Aug 2021
Polarity in the Classroom: A Case Study Leveraging Peer Sentiment Toward Scalable Assessment
Zachariah J. Beasley
L. Piegl
Paul Rosen
40
3
0
02 Aug 2021
LICHEE: Improving Language Model Pre-training with Multi-grained Tokenization
Weidong Guo
Mingjun Zhao
Lusheng Zhang
Di Niu
Jinwen Luo
Zhenhua Liu
Zhenyang Li
J. Tang
55
8
0
02 Aug 2021
Improving Social Meaning Detection with Pragmatic Masking and Surrogate Fine-Tuning
Chiyu Zhang
Muhammad Abdul-Mageed
ObjD
AI4CE
77
6
0
01 Aug 2021