ResearchTrend.AI
XLNet: Generalized Autoregressive Pretraining for Language Understanding

19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
    AI4CE

Papers citing "XLNet: Generalized Autoregressive Pretraining for Language Understanding"

50 / 1,487 papers shown
Rethinking Positional Encoding
Jianqiao Zheng
Sameera Ramasinghe
Simon Lucey
27
51
0
06 Jul 2021
Sarcasm Detection: A Comparative Study
Hamed Yaghoobian
H. Arabnia
Khaled Rasheed
31
22
0
05 Jul 2021
DeepRapper: Neural Rap Generation with Rhyme and Rhythm Modeling
Lanqing Xue
Kaitao Song
Duocai Wu
Xu Tan
N. Zhang
Tao Qin
Weiqiang Zhang
Tie-Yan Liu
37
37
0
05 Jul 2021
DRIFT: A Toolkit for Diachronic Analysis of Scientific Literature
Abheesht Sharma
Gunjan Chhablani
Harshit Pandey
Rajaswa Patil
33
7
0
02 Jul 2021
OPT: Omni-Perception Pre-Trainer for Cross-Modal Understanding and Generation
Jing Liu
Xinxin Zhu
Fei Liu
Longteng Guo
Zijia Zhao
...
Weining Wang
Hanqing Lu
Shiyu Zhou
Jiajun Zhang
Jinqiao Wang
39
37
0
01 Jul 2021
Elbert: Fast Albert with Confidence-Window Based Early Exit
Keli Xie
Siyuan Lu
Meiqi Wang
Zhongfeng Wang
22
20
0
01 Jul 2021
ChineseBERT: Chinese Pretraining Enhanced by Glyph and Pinyin Information
Zijun Sun
Xiaoya Li
Xiaofei Sun
Yuxian Meng
Xiang Ao
Qing He
Fei Wu
Jiwei Li
SSeg
57
184
0
30 Jun 2021
The Values Encoded in Machine Learning Research
Abeba Birhane
Pratyusha Kalluri
Dallas Card
William Agnew
Ravit Dotan
Michelle Bao
41
275
0
29 Jun 2021
Exploring the Efficacy of Automatically Generated Counterfactuals for Sentiment Analysis
Linyi Yang
Jiazheng Li
Padraig Cunningham
Yue Zhang
Barry Smyth
Ruihai Dong
27
47
0
29 Jun 2021
SCARF: Self-Supervised Contrastive Learning using Random Feature Corruption
Dara Bahri
Heinrich Jiang
Yi Tay
Donald Metzler
SSL
28
164
0
29 Jun 2021
Time-Aware Language Models as Temporal Knowledge Bases
Bhuwan Dhingra
Jeremy R. Cole
Julian Martin Eisenschlos
D. Gillick
Jacob Eisenstein
William W. Cohen
KELM
30
266
0
29 Jun 2021
R-Drop: Regularized Dropout for Neural Networks
Xiaobo Liang
Lijun Wu
Juntao Li
Yue Wang
Qi Meng
Tao Qin
Wei Chen
Hao Fei
Tie-Yan Liu
49
424
0
28 Jun 2021
Core Challenges in Embodied Vision-Language Planning
Jonathan M Francis
Nariaki Kitamura
Felix Labelle
Xiaopeng Lu
Ingrid Navarro
Jean Oh
LM&Ro
54
45
0
26 Jun 2021
Learning Language and Multimodal Privacy-Preserving Markers of Mood from Mobile Data
Paul Pu Liang
Terrance Liu
Anna Cai
Michal Muszynski
Ryo Ishii
Nicholas B. Allen
Randy P. Auerbach
David Brent
Ruslan Salakhutdinov
Louis-Philippe Morency
40
16
0
24 Jun 2021
VOLO: Vision Outlooker for Visual Recognition
Li-xin Yuan
Qibin Hou
Zihang Jiang
Jiashi Feng
Shuicheng Yan
ViT
54
315
0
24 Jun 2021
Probabilistic Attention for Interactive Segmentation
Prasad Gabbur
Manjot Bilkhu
J. Movellan
39
13
0
23 Jun 2021
Towards Long-Form Video Understanding
Chaoxia Wu
Philipp Krahenbuhl
VLM
ViT
59
166
0
21 Jun 2021
Anomaly Detection in Dynamic Graphs via Transformer
Yixin Liu
Shirui Pan
Yu Guang Wang
Fei Xiong
Liang Wang
Qingfeng Chen
V. C. Lee
34
91
0
18 Jun 2021
BEiT: BERT Pre-Training of Image Transformers
Hangbo Bao
Li Dong
Songhao Piao
Furu Wei
ViT
68
2,751
0
15 Jun 2021
Delving Deep into the Generalization of Vision Transformers under Distribution Shifts
Chongzhi Zhang
Mingyuan Zhang
Shanghang Zhang
Daisheng Jin
Qiang-feng Zhou
Zhongang Cai
Haiyu Zhao
Xianglong Liu
Ziwei Liu
21
103
0
14 Jun 2021
Determinantal Beam Search
Clara Meister
Martina Forster
Ryan Cotterell
19
13
0
14 Jun 2021
SAS: Self-Augmentation Strategy for Language Model Pre-training
Yifei Xu
Jingqiao Zhang
Ru He
Liangzhu Ge
Chao Yang
Cheng Yang
Ying Wu
42
1
0
14 Jun 2021
Pre-Trained Models: Past, Present and Future
Xu Han
Zhengyan Zhang
Ning Ding
Yuxian Gu
Xiao Liu
...
Jie Tang
Ji-Rong Wen
Jinhui Yuan
Wayne Xin Zhao
Jun Zhu
AIFin
MQ
AI4MH
58
818
0
14 Jun 2021
InfoBehavior: Self-supervised Representation Learning for Ultra-long Behavior Sequence via Hierarchical Grouping
Runshi Liu
Pengda Qin
Yuhong Li
Weigao Wen
Dong Li
Kefeng Deng
Qiang Wu
AI4TS
15
0
0
13 Jun 2021
Can Transformer Language Models Predict Psychometric Properties?
Antonio Laverghetta
Animesh Nighojkar
Jamshidbek Mirzakhalov
John Licato
LM&MA
38
14
0
12 Jun 2021
Neural Combinatory Constituency Parsing
Zhousi Chen
Longtu Zhang
Aizhan Imankulova
Mamoru Komachi
40
2
0
12 Jun 2021
Leveraging Pre-trained Language Model for Speech Sentiment Analysis
Suwon Shon
Pablo Brusco
Jing Pan
Kyu Jeong Han
Shinji Watanabe
17
16
0
11 Jun 2021
What Can Knowledge Bring to Machine Learning? -- A Survey of Low-shot Learning for Structured Data
Yang Hu
Adriane P. Chapman
Guihua Wen
Dame Wendy Hall
46
24
0
11 Jun 2021
CAT: Cross Attention in Vision Transformer
Hezheng Lin
Xingyi Cheng
Xiangyu Wu
Fan Yang
Dong Shen
Zhongyuan Wang
Qing Song
Wei Yuan
ViT
35
149
0
10 Jun 2021
MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training
Mingliang Zeng
Xu Tan
Rui Wang
Zeqian Ju
Tao Qin
Tie-Yan Liu
22
129
0
10 Jun 2021
Semantic-aware Binary Code Representation with BERT
Hyungjoon Koo
Soyeon Park
Daejin Choi
Taesoo Kim
27
23
0
10 Jun 2021
Instantaneous Grammatical Error Correction with Shallow Aggressive Decoding
Xin Sun
Tao Ge
Furu Wei
Houfeng Wang
25
62
0
09 Jun 2021
XtremeDistilTransformers: Task Transfer for Task-agnostic Distillation
Subhabrata Mukherjee
Ahmed Hassan Awadallah
Jianfeng Gao
19
22
0
08 Jun 2021
Measuring and Improving BERT's Mathematical Abilities by Predicting the Order of Reasoning
Piotr Piękos
Henryk Michalewski
Mateusz Malinowski
35
28
0
07 Jun 2021
Layered gradient accumulation and modular pipeline parallelism: fast and efficient training of large language models
J. Lamy-Poirier
MoE
29
8
0
04 Jun 2021
ERNIE-Tiny : A Progressive Distillation Framework for Pretrained Transformer Compression
Weiyue Su
Xuyi Chen
Shi Feng
Jiaxiang Liu
Weixin Liu
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang
34
13
0
04 Jun 2021
Self-supervised Dialogue Learning for Spoken Conversational Question Answering
Nuo Chen
Chenyu You
Yuexian Zou
SSL
28
33
0
04 Jun 2021
Defending Democracy: Using Deep Learning to Identify and Prevent Misinformation
Anusua Trivedi
Alyssa Suhm
Prathamesh Mahankal
Subhiksha Mukuntharaj
Meghana D. Parab
Malvika Mohan
Meredith Berger
Arathi Sethumadhavan
A. Jaiman
Rahul Dodhia
21
0
0
03 Jun 2021
Defending Against Backdoor Attacks in Natural Language Generation
Xiaofei Sun
Xiaoya Li
Yuxian Meng
Xiang Ao
Fei Wu
Jiwei Li
Tianwei Zhang
AAML
SILM
33
47
0
03 Jun 2021
Conversational Question Answering: A Survey
Munazza Zaib
Wei Emma Zhang
Quan Z. Sheng
A. Mahmood
Yang Zhang
48
88
0
02 Jun 2021
Corpus-Based Paraphrase Detection Experiments and Review
T. Vrbanec
A. Meštrović
55
31
0
31 May 2021
How transfer learning impacts linguistic knowledge in deep NLP models?
Nadir Durrani
Hassan Sajjad
Fahim Dalvi
15
49
0
31 May 2021
A Compression-Compilation Framework for On-mobile Real-time BERT Applications
Wei Niu
Zhenglun Kong
Geng Yuan
Weiwen Jiang
Jiexiong Guan
Caiwen Ding
Pu Zhao
Sijia Liu
Bin Ren
Yanzhi Wang
MQ
25
4
0
30 May 2021
Directed Acyclic Graph Network for Conversational Emotion Recognition
Weizhou Shen
Siyue Wu
Yunyi Yang
Xiaojun Quan
39
240
0
27 May 2021
LMMS Reloaded: Transformer-based Sense Embeddings for Disambiguation and Beyond
Daniel Loureiro
A. Jorge
Jose Camacho-Collados
35
26
0
26 May 2021
Read, Listen, and See: Leveraging Multimodal Information Helps Chinese Spell Checking
Heng-Da Xu
Zhongli Li
Qingyu Zhou
Chao Li
Zizhen Wang
Yunbo Cao
Heyan Huang
Xian-Ling Mao
46
94
0
26 May 2021
Focus Attention: Promoting Faithfulness and Diversity in Summarization
Rahul Aralikatte
Shashi Narayan
Joshua Maynez
S. Rothe
Ryan T. McDonald
40
45
0
25 May 2021
TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference
Deming Ye
Yankai Lin
Yufei Huang
Maosong Sun
MQ
27
63
0
25 May 2021
DepressionNet: A Novel Summarization Boosted Deep Framework for Depression Detection on Social Media
Hamad Zogan
Imran Razzak
Shoaib Jameel
Guandong Xu
19
57
0
23 May 2021
Head-driven Phrase Structure Parsing in O($n^3$) Time Complexity
Zuchao Li
Junru Zhou
Hai Zhao
Kevin Parnow
26
0
0
20 May 2021