arXiv: 1906.08237
XLNet: Generalized Autoregressive Pretraining for Language Understanding

19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
    AI4CE

Papers citing "XLNet: Generalized Autoregressive Pretraining for Language Understanding"

50 / 3,522 papers shown
Enhanced Universal Dependency Parsing with Automated Concatenation of Embeddings
Xinyu Wang
Zixia Jia
Yong Jiang
Kewei Tu
58
6
0
06 Jul 2021
What Helps Transformers Recognize Conversational Structure? Importance of Context, Punctuation, and Labels in Dialog Act Recognition
Piotr Żelasko
R. Pappagari
Najim Dehak
60
14
0
05 Jul 2021
Sarcasm Detection: A Comparative Study
Hamed Yaghoobian
H. Arabnia
Khaled Rasheed
59
23
0
05 Jul 2021
ERNIE 3.0: Large-scale Knowledge Enhanced Pre-training for Language Understanding and Generation
Yu Sun
Shuohuan Wang
Shikun Feng
Siyu Ding
Chao Pang
...
Ouyang Xuan
Dianhai Yu
Hao Tian
Hua Wu
Haifeng Wang
119
475
0
05 Jul 2021
DeepRapper: Neural Rap Generation with Rhyme and Rhythm Modeling
Lanqing Xue
Kaitao Song
Duocai Wu
Xu Tan
N. Zhang
Tao Qin
Weiqiang Zhang
Tie-Yan Liu
88
38
0
05 Jul 2021
Sentence-level Online Handwritten Chinese Character Recognition
Yunxin Li
Qian Yang
Qingcai Chen
Lin Ma
Baotian Hu
Xiaolong Wang
Yuxin Ding
25
0
0
04 Jul 2021
DRIFT: A Toolkit for Diachronic Analysis of Scientific Literature
Abheesht Sharma
Gunjan Chhablani
Harshit Pandey
Rajaswa Patil
98
7
0
02 Jul 2021
R2D2: Recursive Transformer based on Differentiable Tree for Interpretable Hierarchical Language Modeling
Xiang Hu
Haitao Mi
Zujie Wen
Yafang Wang
Yi Su
Jing Zheng
Gerard de Melo
77
23
0
02 Jul 2021
Improving Human Motion Prediction Through Continual Learning
M. S. Yasar
Tariq Iqbal
3DH
45
15
0
01 Jul 2021
GlyphCRM: Bidirectional Encoder Representation for Chinese Character with its Glyph
Yunxin Li
Yu Zhao
Baotian Hu
Qingcai Chen
Yang Xiang
Xiaolong Wang
Yuxin Ding
Lin Ma
49
7
0
01 Jul 2021
Ensemble Learning-Based Approach for Improving Generalization Capability of Machine Reading Comprehension Systems
Razieh Baradaran
Hossein Amirkhani
70
16
0
01 Jul 2021
OPT: Omni-Perception Pre-Trainer for Cross-Modal Understanding and Generation
Jing Liu
Xinxin Zhu
Fei Liu
Longteng Guo
Zijia Zhao
...
Weining Wang
Hanqing Lu
Shiyu Zhou
Jiajun Zhang
Jinqiao Wang
95
38
0
01 Jul 2021
Elbert: Fast Albert with Confidence-Window Based Early Exit
Keli Xie
Siyuan Lu
Meiqi Wang
Zhongfeng Wang
58
20
0
01 Jul 2021
XLM-E: Cross-lingual Language Model Pre-training via ELECTRA
Zewen Chi
Shaohan Huang
Li Dong
Shuming Ma
Bo Zheng
...
Payal Bajaj
Xia Song
Xian-Ling Mao
Heyan Huang
Furu Wei
119
121
0
30 Jun 2021
ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information
Zijun Sun
Xiaoya Li
Xiaofei Sun
Yuxian Meng
Xiang Ao
Qing He
Leilei Gan
Jiwei Li
SSeg
147
191
0
30 Jun 2021
The Values Encoded in Machine Learning Research
Abeba Birhane
Pratyusha Kalluri
Dallas Card
William Agnew
Ravit Dotan
Michelle Bao
93
295
0
29 Jun 2021
Unified Questioner Transformer for Descriptive Question Generation in Goal-Oriented Visual Dialogue
Shoya Matsumori
Kosuke Shingyouchi
Yukikoko Abe
Yosuke Fukuchi
K. Sugiura
M. Imai
99
16
0
29 Jun 2021
Exploring the Efficacy of Automatically Generated Counterfactuals for Sentiment Analysis
Linyi Yang
Jiazheng Li
Padraig Cunningham
Yue Zhang
Barry Smyth
Ruihai Dong
92
48
0
29 Jun 2021
SCARF: Self-Supervised Contrastive Learning using Random Feature Corruption
Dara Bahri
Heinrich Jiang
Yi Tay
Donald Metzler
SSL
74
178
0
29 Jun 2021
Time-Aware Language Models as Temporal Knowledge Bases
Bhuwan Dhingra
Jeremy R. Cole
Julian Martin Eisenschlos
D. Gillick
Jacob Eisenstein
William W. Cohen
KELM
148
282
0
29 Jun 2021
Knowledge Transfer by Discriminative Pre-training for Academic Performance Prediction
Byungsoo Kim
Hangyeol Yu
Dongmin Shin
Youngduck Choi
39
1
0
28 Jun 2021
Overview of BioASQ 2020: The eighth BioASQ challenge on Large-Scale Biomedical Semantic Indexing and Question Answering
A. Nentidis
Anastasia Krithara
K. Bougiatiotis
Martin Krallinger
Carlos Rodríguez-Penagos
Marta Villegas
George Giannakopoulos
127
32
0
28 Jun 2021
Complexity-based partitioning of CSFI problem instances with Transformers
Luca Benedetto
P. Fantozzi
L. Laura
15
0
0
28 Jun 2021
R-Drop: Regularized Dropout for Neural Networks
Xiaobo Liang
Lijun Wu
Juntao Li
Yue Wang
Qi Meng
Tao Qin
Wei Chen
Hao Fei
Tie-Yan Liu
90
442
0
28 Jun 2021
Efficient Dialogue State Tracking by Masked Hierarchical Transformer
Min Mao
Jiasheng Liu
Jingyao Zhou
Haipang Wu
58
0
0
28 Jun 2021
Answering Chinese Elementary School Social Study Multiple Choice Questions
Daniel Lee
Chao-Chun Liang
Keh-Yih Su
29
1
0
26 Jun 2021
Core Challenges in Embodied Vision-Language Planning
Jonathan M Francis
Nariaki Kitamura
Felix Labelle
Xiaopeng Lu
Ingrid Navarro
Jean Oh
LM&Ro
144
48
0
26 Jun 2021
Learning to Sample Replacements for ELECTRA Pre-Training
Y. Hao
Li Dong
Hangbo Bao
Ke Xu
Furu Wei
MU
50
12
0
25 Jun 2021
A Picture May Be Worth a Hundred Words for Visual Question Answering
Yusuke Hirota
Noa Garcia
Mayu Otani
Chenhui Chu
Yuta Nakashima
Ittetsu Taniguchi
Takao Onoye
ViT
35
4
0
25 Jun 2021
Learning Language and Multimodal Privacy-Preserving Markers of Mood from Mobile Data
Paul Pu Liang
Terrance Liu
Anna Cai
Michal Muszynski
Ryo Ishii
Nicholas B. Allen
Randy P. Auerbach
David Brent
Ruslan Salakhutdinov
Louis-Philippe Morency
89
18
0
24 Jun 2021
Towards Fully Interpretable Deep Neural Networks: Are We There Yet?
Sandareka Wickramanayake
Wynne Hsu
Mong Li Lee
FaML
AI4CE
46
3
0
24 Jun 2021
VOLO: Vision Outlooker for Visual Recognition
Li-xin Yuan
Qibin Hou
Zihang Jiang
Jiashi Feng
Shuicheng Yan
ViT
137
328
0
24 Jun 2021
Label Disentanglement in Partition-based Extreme Multilabel Classification
Xuanqing Liu
Wei-Cheng Chang
Hsiang-Fu Yu
Cho-Jui Hsieh
Inderjit S. Dhillon
63
11
0
24 Jun 2021
Stable, Fast and Accurate: Kernelized Attention with Relative Positional Encoding
Shengjie Luo
Shanda Li
Tianle Cai
Di He
Dinglan Peng
Shuxin Zheng
Guolin Ke
Liwei Wang
Tie-Yan Liu
95
50
0
23 Jun 2021
Probabilistic Attention for Interactive Segmentation
Prasad Gabbur
Manjot Bilkhu
J. Movellan
103
13
0
23 Jun 2021
LV-BERT: Exploiting Layer Variety for BERT
Weihao Yu
Zihang Jiang
Fei Chen
Qibin Hou
Jiashi Feng
MQ
68
0
0
22 Jun 2021
Towards Long-Form Video Understanding
Chaoxia Wu
Philipp Krahenbuhl
VLM
ViT
125
170
0
21 Jun 2021
VIMPAC: Video Pre-Training via Masked Token Prediction and Contrastive Learning
Hao Tan
Jie Lei
Thomas Wolf
Joey Tianyi Zhou
118
67
0
21 Jun 2021
CIL: Contrastive Instance Learning Framework for Distantly Supervised Relation Extraction
Tao Chen
Haizhou Shi
Siliang Tang
Zhigang Chen
Leilei Gan
Yueting Zhuang
61
56
0
21 Jun 2021
It's FLAN time! Summing feature-wise latent representations for interpretability
An-phi Nguyen
María Rodríguez Martínez
FAtt
37
0
0
18 Jun 2021
SPBERT: An Efficient Pre-training BERT on SPARQL Queries for Question Answering over Knowledge Graphs
H. Tran
Long Phan
J. Anibal
B. Nguyen
Truong-Son Nguyen
RALM
26
8
0
18 Jun 2021
How COVID-19 Has Changed Crowdfunding: Evidence From GoFundMe
Junda Wang
Xupin Zhang
Jiebo Luo
13
4
0
18 Jun 2021
Anomaly Detection in Dynamic Graphs via Transformer
Yixin Liu
Shirui Pan
Yu Guang Wang
Fei Xiong
Liang Wang
Qingfeng Chen
V. C. Lee
76
98
0
18 Jun 2021
Joining datasets via data augmentation in the label space for neural networks
Jake Zhao
Mingfeng Ou
Linji Xue
Yunkai Cui
Sai Wu
Gang Chen
36
2
0
17 Jun 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
399
2,858
0
15 Jun 2021
PairConnect: A Compute-Efficient MLP Alternative to Attention
Zhaozhuo Xu
Minghao Yan
Junyan Zhang
Anshumali Shrivastava
50
1
0
15 Jun 2021
Direction is what you need: Improving Word Embedding Compression in Large Language Models
Klaudia Bałazy
Mohammadreza Banaei
R. Lebret
Jacek Tabor
Karl Aberer
55
7
0
15 Jun 2021
Incorporating Word Sense Disambiguation in Neural Language Models
Jan Philip Wahle
Terry Ruas
Norman Meuschke
Bela Gipp
67
11
0
15 Jun 2021
Delving Deep into the Generalization of Vision Transformers under Distribution Shifts
Chongzhi Zhang
Mingyuan Zhang
Shanghang Zhang
Daisheng Jin
Qiang-feng Zhou
Zhongang Cai
Haiyu Zhao
Xianglong Liu
Ziwei Liu
72
106
0
14 Jun 2021
Determinantal Beam Search
Clara Meister
Martina Forster
Ryan Cotterell
76
13
0
14 Jun 2021