ResearchTrend.AI
XLNet: Generalized Autoregressive Pretraining for Language Understanding


19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
    AI4CE

Papers citing "XLNet: Generalized Autoregressive Pretraining for Language Understanding"

50 / 1,487 papers shown
Retrieval-Augmented Transformer-XL for Close-Domain Dialog Generation
Giovanni Bonetta
R. Cancelliere
Ding Liu
Paul Vozila
RALM
24
16
0
19 May 2021
CoTexT: Multi-task Learning with Code-Text Transformer
Long Phan
H. Tran
Daniel Le
Hieu Duy Nguyen
J. Anibal
Alec Peltekian
Yanfang Ye
27
135
0
18 May 2021
Parallel Attention Network with Sequence Matching for Video Grounding
Hao Zhang
Aixin Sun
Wei Jing
Liangli Zhen
Qiufeng Wang
Rick Siow Mong Goh
23
40
0
18 May 2021
Link Prediction on N-ary Relational Facts: A Graph-based Approach
Quan Wang
Haifeng Wang
Yajuan Lyu
Yong Zhu
24
46
0
18 May 2021
Divide and Contrast: Self-supervised Learning from Uncurated Data
Yonglong Tian
Olivier J. Hénaff
Aaron van den Oord
SSL
69
96
0
17 May 2021
Pay Attention to MLPs
Hanxiao Liu
Zihang Dai
David R. So
Quoc V. Le
AI4CE
57
654
0
17 May 2021
SeaD: End-to-end Text-to-SQL Generation with Schema-aware Denoising
K. Xuan
Yongbo Wang
Yongliang Wang
Zujie Wen
Yang Dong
VLM
38
52
0
17 May 2021
How is BERT surprised? Layerwise detection of linguistic anomalies
Bai Li
Zining Zhu
Guillaume Thomas
Yang Xu
Frank Rudzicz
27
31
0
16 May 2021
BERT Busters: Outlier Dimensions that Disrupt Transformers
Olga Kovaleva
Saurabh Kulshreshtha
Anna Rogers
Anna Rumshisky
27
85
0
14 May 2021
Designing Multimodal Datasets for NLP Challenges
James Pustejovsky
E. Holderness
Jingxuan Tu
Parker Glenn
Kyeongmin Rim
Kelley Lynch
R. Brutti
31
5
0
12 May 2021
Addressing "Documentation Debt" in Machine Learning Research: A Retrospective Datasheet for BookCorpus
Jack Bandy
Nicholas Vincent
29
57
0
11 May 2021
REPT: Bridging Language Models and Machine Reading Comprehension via Retrieval-Based Pre-training
Fangkai Jiao
Yangyang Guo
Yilin Niu
Feng Ji
Feng-Lin Li
Liqiang Nie
LRM
34
12
0
10 May 2021
Logic-Driven Context Extension and Data Augmentation for Logical Reasoning of Text
Siyuan Wang
Wanjun Zhong
Duyu Tang
Zhongyu Wei
Zhihao Fan
Daxin Jiang
Ming Zhou
Nan Duan
NAI
31
70
0
08 May 2021
Empirical Evaluation of Pre-trained Transformers for Human-Level NLP: The Role of Sample Size and Dimensionality
Adithya Ganesan
Matthew Matero
Aravind Reddy Ravula
Huy-Hien Vu
H. Andrew Schwartz
30
35
0
07 May 2021
Security Vulnerability Detection Using Deep Learning Natural Language Processing
Noah Ziems
Shaoen Wu
24
55
0
06 May 2021
Beyond Self-attention: External Attention using Two Linear Layers for Visual Tasks
Meng-Hao Guo
Zheng-Ning Liu
Tai-Jiang Mu
Shimin Hu
28
473
0
05 May 2021
MathBERT: A Pre-Trained Model for Mathematical Formula Understanding
Shuai Peng
Ke Yuan
Liangcai Gao
Zhi Tang
AIMat
54
107
0
02 May 2021
When to Fold'em: How to answer Unanswerable questions
Marshall Ho
Zhipeng Zhou
J. He
36
2
0
01 May 2021
SVT-Net: Super Light-Weight Sparse Voxel Transformer for Large Scale Place Recognition
Zhaoxin Fan
Zhenbo Song
Hongyan Liu
Zhiwu Lu
Jun He
Xiaoyong Du
3DPC
ViT
16
74
0
01 May 2021
Inpainting Transformer for Anomaly Detection
Jonathan Pirnay
K. Chai
ViT
107
165
0
28 Apr 2021
PanGu-α: Large-scale Autoregressive Pretrained Chinese Language Models with Auto-parallel Computation
Wei Zeng
Xiaozhe Ren
Teng Su
Hui Wang
Yi-Lun Liao
...
Gaojun Fan
Yaowei Wang
Xuefeng Jin
Qun Liu
Yonghong Tian
ALM
MoE
AI4CE
35
212
0
26 Apr 2021
Diverse Image Inpainting with Bidirectional and Autoregressive Transformers
Yingchen Yu
Fangneng Zhan
Rongliang Wu
Jianxiong Pan
Kaiwen Cui
Shijian Lu
Feiying Ma
Xuansong Xie
Chunyan Miao
ViT
45
150
0
26 Apr 2021
Literature review on vulnerability detection using NLP technology
Jiajie Wu
44
14
0
23 Apr 2021
Framing Unpacked: A Semi-Supervised Interpretable Multi-View Model of Media Frames
Shima Khanehzar
Trevor Cohn
Gosia Mikołajczak
A. Turpin
Lea Frermann
22
11
0
22 Apr 2021
A Short Survey of Pre-trained Language Models for Conversational AI-A NewAge in NLP
Munazza Zaib
Quan Z. Sheng
W. Zhang
24
67
0
22 Apr 2021
Identify, Align, and Integrate: Matching Knowledge Graphs to Commonsense Reasoning Tasks
Lisa Bauer
Mohit Bansal
19
19
0
20 Apr 2021
Enhancing Cognitive Models of Emotions with Representation Learning
Yuting Guo
Jinho Choi
33
5
0
20 Apr 2021
RoFormer: Enhanced Transformer with Rotary Position Embedding
Jianlin Su
Yu Lu
Shengfeng Pan
Ahmed Murtadha
Bo Wen
Yunfeng Liu
46
2,204
0
20 Apr 2021
Training Value-Aligned Reinforcement Learning Agents Using a Normative Prior
Md Sultan al Nahian
Spencer Frazier
Brent Harrison
Mark O. Riedl
29
18
0
19 Apr 2021
Understanding Chinese Video and Language via Contrastive Multimodal Pre-Training
Chenyi Lei
Shixian Luo
Yong Liu
Wanggui He
Jiamang Wang
Guoxin Wang
Haihong Tang
Chunyan Miao
Houqiang Li
30
41
0
19 Apr 2021
A novel time-frequency Transformer based on self-attention mechanism and its application in fault diagnosis of rolling bearings
Yifei Ding
M. Jia
Qiuhua Miao
Yudong Cao
16
268
0
19 Apr 2021
Emotion-Regularized Conditional Variational Autoencoder for Emotional Response Generation
Yu-Ping Ruan
Zhenhua Ling
DRL
27
16
0
18 Apr 2021
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity
Yao Lu
Max Bartolo
Alastair Moore
Sebastian Riedel
Pontus Stenetorp
AILaw
LRM
281
1,124
0
18 Apr 2021
A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation
Tianyu Liu
Yizhe Zhang
Chris Brockett
Yi Mao
Zhifang Sui
Weizhu Chen
W. Dolan
HILM
228
144
0
18 Apr 2021
"Average" Approximates "First Principal Component"? An Empirical Analysis on Representations from Neural Language Models
Zihan Wang
Chengyu Dong
Jingbo Shang
FAtt
42
4
0
18 Apr 2021
AMMU : A Survey of Transformer-based Biomedical Pretrained Language Models
Katikapalli Subramanyam Kalyan
A. Rajasekharan
S. Sangeetha
LM&MA
MedIm
31
164
0
16 Apr 2021
Gradient-based Adversarial Attacks against Text Transformers
Chuan Guo
Alexandre Sablayrolles
Hervé Jégou
Douwe Kiela
SILM
106
230
0
15 Apr 2021
Syntactic Perturbations Reveal Representational Correlates of Hierarchical Phrase Structure in Pretrained Language Models
Matteo Alleman
J. Mamou
Miguel Rio
Hanlin Tang
Yoon Kim
SueYeon Chung
NAI
46
17
0
15 Apr 2021
Hierarchical Learning for Generation with Long Source Sequences
T. Rohde
Xiaoxia Wu
Yinhan Liu
BDL
VLM
25
56
0
15 Apr 2021
K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce
Song Xu
Haoran Li
Peng Yuan
Yujia Wang
Youzheng Wu
Xiaodong He
Ying Liu
Bowen Zhou
KELM
38
24
0
14 Apr 2021
AR-LSAT: Investigating Analytical Reasoning of Text
Wanjun Zhong
Siyuan Wang
Duyu Tang
Zenan Xu
Daya Guo
Jiahai Wang
Jian Yin
Ming Zhou
Nan Duan
ELM
27
41
0
14 Apr 2021
Developing a Conversational Recommendation System for Navigating Limited Options
Victor S. Bursztyn
Jennifer Healey
Eunyee Koh
Nedim Lipka
Larry Birnbaum
25
7
0
13 Apr 2021
Evaluating Pre-Trained Models for User Feedback Analysis in Software Engineering: A Study on Classification of App-Reviews
M. Hadi
Fatemeh H. Fard
26
30
0
12 Apr 2021
Escaping the Big Data Paradigm with Compact Transformers
Ali Hassani
Steven Walton
Nikhil Shah
Abulikemu Abuduweili
Jiachen Li
Humphrey Shi
79
462
0
12 Apr 2021
A Deep Learning Based Cost Model for Automatic Code Optimization
Riyadh Baghdadi
Massinissa Merouani
Mohamed-Hicham Leghettas
K. Abdous
T. Arbaoui
K. Benatchba
Saman P. Amarasinghe
35
68
0
11 Apr 2021
Efficient Large-Scale Language Model Training on GPU Clusters Using Megatron-LM
Deepak Narayanan
Mohammad Shoeybi
Jared Casper
P. LeGresley
M. Patwary
...
Prethvi Kashinkunti
J. Bernauer
Bryan Catanzaro
Amar Phanishayee
Matei A. Zaharia
MoE
37
656
0
09 Apr 2021
HumAID: Human-Annotated Disaster Incidents Data from Twitter with Deep Learning Benchmarks
Firoj Alam
U. Qazi
Muhammad Imran
Ferda Ofli
25
65
0
07 Apr 2021
Semantic Distance: A New Metric for ASR Performance Analysis Towards Spoken Language Understanding
Suyoun Kim
Abhinav Arora
Duc Le
Ching-Feng Yeh
Christian Fuegen
Ozlem Kalinli
M. Seltzer
28
25
0
05 Apr 2021
A Heuristic-driven Uncertainty based Ensemble Framework for Fake News Detection in Tweets and News Articles
Sourya Dipta Das
Ayan Basak
S. Dutta
47
47
0
05 Apr 2021
A New Approach to Overgenerating and Scoring Abstractive Summaries
Kaiqiang Song
Bingqing Wang
Z. Feng
Fei Liu
22
17
0
05 Apr 2021