XLNet: Generalized Autoregressive Pretraining for Language Understanding

19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
    AI4CE

Papers citing "XLNet: Generalized Autoregressive Pretraining for Language Understanding"

50 / 3,521 papers shown
XLM-T: Scaling up Multilingual Machine Translation with Pretrained Cross-lingual Transformer Encoders
Shuming Ma
Jian Yang
Haoyang Huang
Zewen Chi
Li Dong
...
Akiko Eriguchi
Saksham Singhal
Xia Song
Arul Menezes
Furu Wei
LRM
85
33
0
31 Dec 2020
BANG: Bridging Autoregressive and Non-autoregressive Generation with Large Scale Pretraining
Weizhen Qi
Yeyun Gong
Jian Jiao
Yu Yan
Weizhu Chen
...
Houqiang Li
Jiusheng Chen
Ruofei Zhang
Ming Zhou
Nan Duan
104
46
0
31 Dec 2020
Fast WordPiece Tokenization
Xinying Song
Alexandru Salcianu
Yang Song
Dave Dopson
Denny Zhou
109
166
0
31 Dec 2020
CLEAR: Contrastive Learning for Sentence Representation
Zhuofeng Wu
Sinong Wang
Jiatao Gu
Madian Khabsa
Fei Sun
Hao Ma
SSL
82
324
0
31 Dec 2020
Text-Free Image-to-Speech Synthesis Using Learned Segmental Units
Wei-Ning Hsu
David Harwath
Christopher Song
James R. Glass
CLIP
90
67
0
31 Dec 2020
Verb Knowledge Injection for Multilingual Event Processing
Olga Majewska
Ivan Vulić
Goran Glavaš
Edoardo Ponti
Anna Korhonen
85
11
0
31 Dec 2020
An Experimental Evaluation of Transformer-based Language Models in the Biomedical Domain
Paul Grouchy
Shobhit Jain
Michael Liu
Kuhan Wang
Max Tian
Nidhi Arora
Hillary Ngai
Faiza Khan Khattak
Elham Dolatabadi
S. Kocak
LM&MA, MedIm
115
4
0
31 Dec 2020
UNIMO: Towards Unified-Modal Understanding and Generation via Cross-Modal Contrastive Learning
Wei Li
Can Gao
Guocheng Niu
Xinyan Xiao
Hao Liu
Jiachen Liu
Hua Wu
Haifeng Wang
148
382
0
31 Dec 2020
Optimizing Deeper Transformers on Small Datasets
Peng Xu
Dhruv Kumar
Wei Yang
Wenjie Zi
Keyi Tang
Chenyang Huang
Jackie C.K. Cheung
S. Prince
Yanshuai Cao
AI4CE
113
69
0
30 Dec 2020
Deriving Contextualised Semantic Features from BERT (and Other Transformer Model) Embeddings
Jacob Turton
D. Vinson
Robert Smith
44
25
0
30 Dec 2020
DynaSent: A Dynamic Benchmark for Sentiment Analysis
Christopher Potts
Zhengxuan Wu
Atticus Geiger
Douwe Kiela
299
80
0
30 Dec 2020
CMV-BERT: Contrastive multi-vocab pretraining of BERT
Wei-wei Zhu
Daniel Cheung
SSL, VLM
72
0
0
29 Dec 2020
RADDLE: An Evaluation Benchmark and Analysis Platform for Robust Task-oriented Dialog Systems
Baolin Peng
Chunyuan Li
Zhu Zhang
Chenguang Zhu
Jinchao Li
Jianfeng Gao
69
50
0
29 Dec 2020
Multiple Structural Priors Guided Self Attention Network for Language Understanding
Le Qi
Yu Zhang
Qingyu Yin
Ting Liu
25
1
0
29 Dec 2020
TensorX: Extensible API for Neural Network Model Design and Deployment
Davide Nunes
Luis M. Antunes
31
0
0
29 Dec 2020
Universal Sentence Representation Learning with Conditional Masked Language Model
Ziyi Yang
Yinfei Yang
Daniel Cer
Jax Law
Eric F. Darve
SSL
93
58
0
28 Dec 2020
DeepHateExplainer: Explainable Hate Speech Detection in Under-resourced Bengali Language
Md. Rezaul Karim
Sumon Dey
Tanhim Islam
Sagor Sarker
Mehadi Hasan Menon
Kabir Hossain
Bharathi Raja Chakravarthi
Md. Azam Hossain
Stefan Decker
107
85
0
28 Dec 2020
BURT: BERT-inspired Universal Representation from Learning Meaningful Segment
Yian Li
Hai Zhao
SSL
37
0
0
28 Dec 2020
Adaptive Convolution for Semantic Role Labeling
Kashif Munir
Hai Zhao
Z. Li
34
12
0
27 Dec 2020
SG-Net: Syntax Guided Transformer for Language Representation
Zhuosheng Zhang
Yuwei Wu
Junru Zhou
Sufeng Duan
Hai Zhao
Rui Wang
129
38
0
27 Dec 2020
LOREN: Logic-Regularized Reasoning for Interpretable Fact Verification
Jiangjie Chen
Qiaoben Bao
Changzhi Sun
Xinbo Zhang
Jiaze Chen
Hao Zhou
Yanghua Xiao
Lei Li
LRM
115
34
0
25 Dec 2020
Leveraging GPT-2 for Classifying Spam Reviews with Limited Labeled Data via Adversarial Training
Athirai Aravazhi Irissappane
Hanfei Yu
Yankun Shen
Anubha Agrawal
Gray Stanton
52
9
0
24 Dec 2020
Co-GAT: A Co-Interactive Graph Attention Network for Joint Dialog Act Recognition and Sentiment Classification
Libo Qin
Zhouyang Li
Wanxiang Che
Minheng Ni
Ting Liu
84
66
0
24 Dec 2020
A Survey on Visual Transformer
Kai Han
Yunhe Wang
Hanting Chen
Xinghao Chen
Jianyuan Guo
...
Chunjing Xu
Yixing Xu
Zhaohui Yang
Yiman Zhang
Dacheng Tao
ViT
237
2,293
0
23 Dec 2020
Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning
Armen Aghajanyan
Luke Zettlemoyer
Sonal Gupta
110
578
1
22 Dec 2020
RealFormer: Transformer Likes Residual Attention
Ruining He
Anirudh Ravula
Bhargav Kanagal
Joshua Ainslie
76
111
0
21 Dec 2020
Neural Methods for Effective, Efficient, and Exposure-Aware Information Retrieval
Bhaskar Mitra
83
6
0
21 Dec 2020
CSKG: The CommonSense Knowledge Graph
Filip Ilievski
Pedro A. Szekely
Bin Zhang
91
89
0
21 Dec 2020
An End-to-End Document-Level Neural Discourse Parser Exploiting Multi-Granularity Representations
Ke Shi
Zhengyuan Liu
Nancy F. Chen
41
7
0
21 Dec 2020
BERTChem-DDI : Improved Drug-Drug Interaction Prediction from text using Chemical Structure Information
Ishani Mondal
18
10
0
21 Dec 2020
Domain specific BERT representation for Named Entity Recognition of lab protocol
Tejas Vaidhya
Ayush Kaushal
62
9
0
21 Dec 2020
Towards Incorporating Entity-specific Knowledge Graph Information in Predicting Drug-Drug Interactions
Ishani Mondal
15
3
0
21 Dec 2020
Adaptive Bi-directional Attention: Exploring Multi-Granularity Representations for Machine Reading Comprehension
Nuo Chen
Fenglin Liu
Chenyu You
Peilin Zhou
Yuexian Zou
77
31
0
20 Dec 2020
Breaking Writer's Block: Low-cost Fine-tuning of Natural Language Generation Models
Alexandre Duval
Thomas Lamson
Gael de Leseleuc de Kerouara
Matthias Gallé
59
0
0
19 Dec 2020
Exploring Fluent Query Reformulations with Text-to-Text Transformers and Reinforcement Learning
Jerry Zikun Chen
S. Yu
Haoran Wang
444
5
0
18 Dec 2020
Leveraging Meta-path Contexts for Classification in Heterogeneous Information Networks
Xiang Li
Danhao Ding
B. Kao
Yizhou Sun
N. Mamoulis
149
49
0
18 Dec 2020
PCT: Point cloud transformer
Meng-Hao Guo
Junxiong Cai
Zheng-Ning Liu
Tai-Jiang Mu
Ralph Robert Martin
Shimin Hu
ViT, 3DPC
242
1,650
0
17 Dec 2020
Point Transformer
Hengshuang Zhao
Li Jiang
Jiaya Jia
Philip Torr
V. Koltun
3DPC, ViT
35
14
0
16 Dec 2020
Multilingual Evidence Retrieval and Fact Verification to Combat Global Disinformation: The Power of Polyglotism
Denisa A.O. Roberts
72
3
0
16 Dec 2020
DialogXL: All-in-One XLNet for Multi-Party Conversation Emotion Recognition
Weizhou Shen
Junqing Chen
Xiaojun Quan
Zhixiang Xie
98
207
0
16 Dec 2020
Pre-Training Transformers as Energy-Based Cloze Models
Kevin Clark
Minh-Thang Luong
Quoc V. Le
Christopher D. Manning
77
80
0
15 Dec 2020
EmpLite: A Lightweight Sequence Labeling Model for Emphasis Selection of Short Texts
Vibhav Agarwal
Sourav Ghosh
Kranti Chalamalasetti
B. Challa
S. Kumari
Harshavardhana
Barath Raj Kandur Raja
31
3
0
15 Dec 2020
*-CFQ: Analyzing the Scalability of Machine Learning on a Compositional Task
Dmitry Tsarkov
Tibor Tihon
Nathan Scales
Nikola Momchev
Danila Sinopalnikov
Nathanael Scharli
76
17
0
15 Dec 2020
Primer AI's Systems for Acronym Identification and Disambiguation
Nicholas Egan
John Bohannon
55
8
0
14 Dec 2020
Parameter-Efficient Transfer Learning with Diff Pruning
Demi Guo
Alexander M. Rush
Yoon Kim
95
407
0
14 Dec 2020
LRC-BERT: Latent-representation Contrastive Knowledge Distillation for Natural Language Understanding
Hao Fu
Shaojun Zhou
Qihong Yang
Junjie Tang
Guiquan Liu
Kaikui Liu
Xiaolong Li
119
60
0
14 Dec 2020
Yelp Review Rating Prediction: Machine Learning and Deep Learning Models
Zefang Liu
VLM
49
15
0
12 Dec 2020
DeCoAR 2.0: Deep Contextualized Acoustic Representations with Vector Quantization
Shaoshi Ling
Yuzong Liu
83
107
0
11 Dec 2020
Morphology Matters: A Multilingual Language Modeling Analysis
Hyunji Hayley Park
Katherine J. Zhang
Coleman Haley
K. Steimel
Han Liu
Lane Schwartz
105
49
0
11 Dec 2020
Reinforced Multi-Teacher Selection for Knowledge Distillation
Fei Yuan
Linjun Shou
J. Pei
Wutao Lin
Ming Gong
Yan Fu
Daxin Jiang
75
124
0
11 Dec 2020