ResearchTrend.AI
XLNet: Generalized Autoregressive Pretraining for Language Understanding

19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
    AI4CE

Papers citing "XLNet: Generalized Autoregressive Pretraining for Language Understanding"

50 / 3,521 papers shown
Title
On Representation Learning for Scientific News Articles Using Heterogeneous Knowledge Graphs
Angelika Romanou
Panayiotis Smeros
Karl Aberer
25
2
0
12 Apr 2021
Evaluating Pre-Trained Models for User Feedback Analysis in Software Engineering: A Study on Classification of App-Reviews
M. Hadi
Fatemeh H. Fard
63
33
0
12 Apr 2021
SpartQA: A Textual Question Answering Benchmark for Spatial Reasoning
Roshanak Mirzaee
Hossein Rajaby Faghihi
Qiang Ning
Parisa Kordjamshidi
56
83
0
12 Apr 2021
Escaping the Big Data Paradigm with Compact Transformers
Ali Hassani
Steven Walton
Nikhil Shah
Abulikemu Abuduweili
Jiachen Li
Humphrey Shi
156
465
0
12 Apr 2021
Learning to Remove: Towards Isotropic Pre-trained BERT Embedding
Y. Liang
Rui Cao
Jie Zheng
Jie Ren
Ling Gao
SSL
180
28
0
12 Apr 2021
Constructing Contrastive samples via Summarization for Text Classification with limited annotations
Yangkai Du
Tengfei Ma
Lingfei Wu
Fangli Xu
Xuhong Zhang
Bo Long
S. Ji
43
10
0
11 Apr 2021
A Deep Learning Based Cost Model for Automatic Code Optimization
Riyadh Baghdadi
Massinissa Merouani
Mohamed-Hicham Leghettas
K. Abdous
T. Arbaoui
K. Benatchba
Saman P. Amarasinghe
81
71
0
11 Apr 2021
Non-autoregressive Transformer-based End-to-end ASR using BERT
Fu-Hao Yu
Kuan-Yu Chen
59
23
0
10 Apr 2021
WLV-RIT at SemEval-2021 Task 5: A Neural Transformer Framework for Detecting Toxic Spans
Tharindu Ranasinghe
Diptanu Sarkar
Marcos Zampieri
Alexander Ororbia
MedIm
60
13
0
09 Apr 2021
AdCOFE: Advanced Contextual Feature Extraction in Conversations for emotion classification
Vaibhav Bhat
Anita Yadav
Sonal Yadav
Dhivya Chandrasekaran
Vijay K. Mago
42
5
0
09 Apr 2021
Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM
Deepak Narayanan
Mohammad Shoeybi
Jared Casper
P. LeGresley
M. Patwary
...
Prethvi Kashinkunti
J. Bernauer
Bryan Catanzaro
Amar Phanishayee
Matei A. Zaharia
MoE
235
716
0
09 Apr 2021
Larger-Context Tagging: When and Why Does It Work?
Jinlan Fu
Liangjing Feng
Qi Zhang
Xuanjing Huang
Pengfei Liu
61
5
0
09 Apr 2021
Transformers: "The End of History" for NLP?
Anton Chernyavskiy
Dmitry Ilvovsky
Preslav Nakov
117
30
0
09 Apr 2021
HumAID: Human-Annotated Disaster Incidents Data from Twitter with Deep Learning Benchmarks
Firoj Alam
U. Qazi
Muhammad Imran
Ferda Ofli
80
67
0
07 Apr 2021
CodeTrans: Towards Cracking the Language of Silicon's Code Through Self-Supervised Deep Learning and High Performance Computing
Ahmed Elnaggar
Wei Ding
Llion Jones
Tom Gibbs
Tamas B. Fehér
Christoph Angerer
Silvia Severini
Florian Matthes
B. Rost
72
72
0
06 Apr 2021
Semantic Distance: A New Metric for ASR Performance Analysis Towards Spoken Language Understanding
Suyoun Kim
Abhinav Arora
Duc Le
Ching-Feng Yeh
Christian Fuegen
Ozlem Kalinli
M. Seltzer
75
28
0
05 Apr 2021
Exploring Transformers in Emotion Recognition: a comparison of BERT, DistillBERT, RoBERTa, XLNet and ELECTRA
Diogo Cortiz
38
38
0
05 Apr 2021
A Heuristic-driven Uncertainty based Ensemble Framework for Fake News Detection in Tweets and News Articles
Sourya Dipta Das
Ayan Basak
S. Dutta
82
50
0
05 Apr 2021
A New Approach to Overgenerating and Scoring Abstractive Summaries
Kaiqiang Song
Bingqing Wang
Z. Feng
Fei Liu
110
17
0
05 Apr 2021
Recommending Metamodel Concepts during Modeling Activities with Pre-Trained Language Models
Martin Weyssow
H. Sahraoui
Eugene Syriani
82
54
0
04 Apr 2021
IITK@Detox at SemEval-2021 Task 5: Semi-Supervised Learning and Dice Loss for Toxic Spans Detection
Archit Bansal
Abhay Kaushik
Ashutosh Modi
48
4
0
04 Apr 2021
SGCN:Sparse Graph Convolution Network for Pedestrian Trajectory Prediction
Liushuai Shi
Le Wang
Chengjiang Long
Sanping Zhou
Mo Zhou
Zhenxing Niu
G. Hua
103
237
0
04 Apr 2021
Exploring the Role of BERT Token Representations to Explain Sentence Probing Results
Hosein Mohebbi
Ali Modarressi
Mohammad Taher Pilehvar
MILM
67
26
0
03 Apr 2021
Humor@IITK at SemEval-2021 Task 7: Large Language Models for Quantifying Humor and Offensiveness
Aishwarya Gupta
Avik Pal
Bholeshwar Khurana
Lakshay Tyagi
Ashutosh Modi
53
6
0
02 Apr 2021
Low-Resource Language Modelling of South African Languages
Stuart Mesham
Luc Hayward
Jared Shapiro
Jan Buys
73
16
0
01 Apr 2021
Motion Guided Attention Fusion to Recognize Interactions from Videos
Tae Soo Kim
Jonathan D. Jones
Gregory Hager
43
15
0
01 Apr 2021
Normal vs. Adversarial: Salience-based Analysis of Adversarial Samples for Relation Extraction
Luoqiu Li
Xiang Chen
Zhen Bi
Xin Xie
Shumin Deng
Ningyu Zhang
Chuanqi Tan
Mosha Chen
Huajun Chen
AAML
112
7
0
01 Apr 2021
Dual Contrastive Loss and Attention for GANs
Ning Yu
Guilin Liu
Aysegül Dündar
Andrew Tao
Bryan Catanzaro
Larry S. Davis
Mario Fritz
GAN
133
61
0
31 Mar 2021
Pre-training for low resource speech-to-intent applications
Pu Wang
Hugo Van hamme
45
4
0
30 Mar 2021
Kaleido-BERT: Vision-Language Pre-training on Fashion Domain
Mingchen Zhuge
D. Gao
Deng-Ping Fan
Linbo Jin
Ben Chen
Hao Zhou
Minghui Qiu
Ling Shao
VLM
101
121
0
30 Mar 2021
Self-supervised Image-text Pre-training With Mixed Data In Chest X-rays
Xiaosong Wang
Ziyue Xu
Leo K. Tam
Dong Yang
Daguang Xu
ViT MedIm
70
24
0
30 Mar 2021
Retraining DistilBERT for a Voice Shopping Assistant by Using Universal Dependencies
P. Jayarao
Arpit Sharma
64
2
0
29 Mar 2021
On the Adversarial Robustness of Vision Transformers
Rulin Shao
Zhouxing Shi
Jinfeng Yi
Pin-Yu Chen
Cho-Jui Hsieh
ViT
115
146
0
29 Mar 2021
Be Careful about Poisoned Word Embeddings: Exploring the Vulnerability of the Embedding Layers in NLP Models
Wenkai Yang
Lei Li
Zhiyuan Zhang
Xuancheng Ren
Xu Sun
Bin He
SILM
111
153
0
29 Mar 2021
Efficient Explanations from Empirical Explainers
Robert Schwarzenberg
Nils Feldhus
Sebastian Möller
FAtt
93
9
0
29 Mar 2021
Machine Learning Meets Natural Language Processing -- The story so far
N. Galanis
P. Vafiadis
K.-G. Mirzaev
G. Papakostas
85
7
0
27 Mar 2021
Unsupervised Document Embedding via Contrastive Augmentation
Dongsheng Luo
Wei Cheng
Jingchao Ni
Wenchao Yu
Xuchao Zhang
...
Yanchi Liu
Zhengzhang Chen
Dongjin Song
Haifeng Chen
Xiang Zhang
SSL
67
12
0
26 Mar 2021
AgentFormer: Agent-Aware Transformers for Socio-Temporal Multi-Agent Forecasting
Ye Yuan
Xinshuo Weng
Yanglan Ou
Kris Kitani
AI4TS
117
461
0
25 Mar 2021
K-XLNet: A General Method for Combining Explicit Knowledge with Language Model Pretraining
Rui Yan
Lanchang Sun
Fang Wang
Xiaoming Zhang
KELM AI4CE
43
1
0
25 Mar 2021
FastMoE: A Fast Mixture-of-Expert Training System
Jiaao He
J. Qiu
Aohan Zeng
Zhilin Yang
Jidong Zhai
Jie Tang
ALM MoE
112
104
0
24 Mar 2021
Czert -- Czech BERT-like Model for Language Representation
Jakub Sido
O. Pražák
P. Pribán
Jan Pasek
Michal Seják
Miloslav Konopík
76
44
0
24 Mar 2021
UNICORN on RAINBOW: A Universal Commonsense Reasoning Model on a New Multitask Benchmark
Nicholas Lourie
Ronan Le Bras
Chandra Bhagavatula
Yejin Choi
LRM
108
140
0
24 Mar 2021
Complex Factoid Question Answering with a Free-Text Knowledge Graph
Chen Zhao
Chenyan Xiong
Xin Qian
Jordan L. Boyd-Graber
79
38
0
23 Mar 2021
The NLP Cookbook: Modern Recipes for Transformer based Deep Learning Architectures
Sushant Singh
A. Mahmood
AI4TS
120
96
0
23 Mar 2021
Variable Name Recovery in Decompiled Binary Code using Constrained Masked Language Modeling
Pratyay Banerjee
Kuntal Kumar Pal
Fish Wang
Chitta Baral
57
13
0
23 Mar 2021
SelfExplain: A Self-Explaining Architecture for Neural Text Classifiers
Dheeraj Rajagopal
Vidhisha Balachandran
Eduard H. Hovy
Yulia Tsvetkov
MILM SSL FAtt AI4TS
90
67
0
23 Mar 2021
Tiny Transformers for Environmental Sound Classification at the Edge
David Elliott
Carlos E. Otero
Steven Wyatt
Evan Martino
81
16
0
22 Mar 2021
BERT: A Review of Applications in Natural Language Processing and Understanding
M. V. Koroteev
VLM
136
226
0
22 Mar 2021
Identifying Machine-Paraphrased Plagiarism
Jan Philip Wahle
Terry Ruas
Tomáš Foltýnek
Norman Meuschke
Bela Gipp
89
32
0
22 Mar 2021
Grey-box Adversarial Attack And Defence For Sentiment Classification
Ying Xu
Xu Zhong
Antonio Jimeno Yepes
Jey Han Lau
VLM AAML
70
54
0
22 Mar 2021