XLNet: Generalized Autoregressive Pretraining for Language Understanding

19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
    AI4CE

Papers citing "XLNet: Generalized Autoregressive Pretraining for Language Understanding"

50 / 3,518 papers shown
Robust Prediction of Punctuation and Truecasing for Medical ASR
Monica Sunkara
S. Ronanki
Kalpit Dixit
S. Bodapati
Katrin Kirchhoff
54
33
0
04 Jul 2020
Abstractive and mixed summarization for long-single documents
Roger Barrull
Jugal Kalita
40
0
0
03 Jul 2020
MIRA: Leveraging Multi-Intention Co-click Information in Web-scale Document Retrieval using Deep Neural Networks
Yusi Zhang
Chuanjie Liu
Angen Luo
Hui Xue
Xuan Shan
Y. Luo
Yiqian Xia
Yuanchi Yan
Haidong Wang
90
6
0
03 Jul 2020
Learn Faster and Forget Slower via Fast and Stable Task Adaptation
Farshid Varno
Lucas May Petry
Lisa Di-Jorio
Stan Matwin
CLL
60
2
0
02 Jul 2020
DAPPLE: A Pipelined Data Parallel Approach for Training Large Models
Shiqing Fan
Yi Rong
Chen Meng
Zongyan Cao
Siyu Wang
...
Jun Yang
Lixue Xia
Lansong Diao
Xiaoyong Liu
Wei Lin
96
242
0
02 Jul 2020
On Linear Identifiability of Learned Representations
Geoffrey Roeder
Luke Metz
Diederik P. Kingma
CML
74
85
0
01 Jul 2020
A Survey on Self-supervised Pre-training for Sequential Transfer Learning in Neural Networks
H. H. Mao
BDL SSL
72
50
0
01 Jul 2020
DocVQA: A Dataset for VQA on Document Images
Minesh Mathew
Dimosthenis Karatzas
C. V. Jawahar
169
748
0
01 Jul 2020
Private Speech Classification with Secure Multiparty Computation
Kyle Bittner
Martine De Cock
Rafael Dowsley
70
1
0
01 Jul 2020
SemEval-2020 Task 4: Commonsense Validation and Explanation
Cunxiang Wang
Shuailong Liang
Yili Jin
Yilong Wang
Xiao-Dan Zhu
Yue Zhang
LRM
141
99
0
01 Jul 2020
Data Movement Is All You Need: A Case Study on Optimizing Transformers
A. Ivanov
Nikoli Dryden
Tal Ben-Nun
Shigang Li
Torsten Hoefler
142
135
0
30 Jun 2020
Segmentation Approach for Coreference Resolution Task
A. Jafari
A. Ghodsi
3DV
32
0
0
30 Jun 2020
Guided Learning of Nonconvex Models through Successive Functional Gradient Optimization
Rie Johnson
Tong Zhang
23
8
0
30 Jun 2020
Learning Sparse Prototypes for Text Generation
Junxian He
Taylor Berg-Kirkpatrick
Graham Neubig
86
23
0
29 Jun 2020
Transformers are RNNs: Fast Autoregressive Transformers with Linear Attention
Angelos Katharopoulos
Apoorv Vyas
Nikolaos Pappas
François Fleuret
218
1,800
0
29 Jun 2020
Natural Backdoor Attack on Text Data
Lichao Sun
SILM
81
41
0
29 Jun 2020
Improving Sequence Tagging for Vietnamese Text Using Transformer-based Neural Models
Viet The Bui
Oanh T. K. Tran
Hong Phuong Le
74
40
0
29 Jun 2020
Knowledge-Aware Language Model Pretraining
Corby Rosset
Chenyan Xiong
M. Phan
Xia Song
Paul N. Bennett
Saurabh Tiwary
KELM
94
83
0
29 Jun 2020
Rethinking Positional Encoding in Language Pre-training
Guolin Ke
Di He
Tie-Yan Liu
171
299
0
28 Jun 2020
BOND: BERT-Assisted Open-Domain Named Entity Recognition with Distant Supervision
Chen Liang
Yue Yu
Haoming Jiang
Siawpeng Er
Ruijia Wang
T. Zhao
Chao Zhang
OffRL
78
240
0
28 Jun 2020
GPT-GNN: Generative Pre-Training of Graph Neural Networks
Ziniu Hu
Yuxiao Dong
Kuansan Wang
Kai-Wei Chang
Yizhou Sun
SSL AI4CE
174
567
0
27 Jun 2020
BERTology Meets Biology: Interpreting Attention in Protein Language Models
Jesse Vig
Ali Madani
Lav Varshney
Caiming Xiong
R. Socher
Nazneen Rajani
110
295
0
26 Jun 2020
Train and You'll Miss It: Interactive Model Iteration with Weak Supervision and Pre-Trained Embeddings
Mayee F. Chen
Daniel Y. Fu
Frederic Sala
Sen Wu
Ravi Teja Mullapudi
Fait Poms
Kayvon Fatahalian
Christopher Ré
61
10
0
26 Jun 2020
Pre-training via Paraphrasing
M. Lewis
Marjan Ghazvininejad
Gargi Ghosh
Armen Aghajanyan
Sida I. Wang
Luke Zettlemoyer
AIMat
102
161
0
26 Jun 2020
Subpopulation Data Poisoning Attacks
Matthew Jagielski
Giorgio Severi
Niklas Pousette Harger
Alina Oprea
AAML SILM
109
122
0
24 Jun 2020
Supervised Understanding of Word Embeddings
H. Yerebakan
Parmeet S. Bhatia
Y. Shinagawa
SSL
43
0
0
23 Jun 2020
Can you tell? SSNet -- a Sagittal Stratum-inspired Neural Network Framework for Sentiment Analysis
Apostol T. Vassilev
Munawar Hasan
Honglan Jin
37
1
0
23 Jun 2020
The Depth-to-Width Interplay in Self-Attention
Yoav Levine
Noam Wies
Or Sharir
Hofit Bata
Amnon Shashua
137
46
0
22 Jun 2020
Open-Domain Conversational Agents: Current Progress, Open Problems, and Future Directions
Stephen Roller
Y-Lan Boureau
Jason Weston
Antoine Bordes
Emily Dinan
...
Kurt Shuster
Eric Michael Smith
Arthur Szlam
Jack Urbanek
Mary Williamson
LLMAG AI4CE
132
52
0
22 Jun 2020
A Survey on Machine Reading Comprehension: Tasks, Evaluation Metrics and Benchmark Datasets
Chengchang Zeng
Shaobo Li
Qin Li
Jie Hu
Jianjun Hu
114
101
0
21 Jun 2020
A Survey of Syntactic-Semantic Parsing Based on Constituent and Dependency Structures
Meishan Zhang
SSeg
48
36
0
19 Jun 2020
A Qualitative Evaluation of Language Models on Automatic Question-Answering for COVID-19
David Oniani
Yanshan Wang
71
32
0
19 Jun 2020
SenWave: Monitoring the Global Sentiments under the COVID-19 Pandemic
Qiang Yang
Hind Alamro
Somayah Albaradei
Adil Salhi
Xiaoting Lv
...
Wei Wang
T. Gojobori
C. Duarte
Xin Gao
Xiangliang Zhang
50
35
0
18 Jun 2020
Infinite attention: NNGP and NTK for deep attention networks
Jiri Hron
Yasaman Bahri
Jascha Narain Sohl-Dickstein
Roman Novak
60
116
0
18 Jun 2020
I-BERT: Inductive Generalization of Transformer to Arbitrary Context Lengths
Hyoungwook Nam
S. Seo
Vikram Sharma Malithody
Noor Michael
Lang Li
28
1
0
18 Jun 2020
Cross-lingual Retrieval for Iterative Self-Supervised Training
C. Tran
Y. Tang
Xian Li
Jiatao Gu
RALM
70
75
0
16 Jun 2020
Untangling tradeoffs between recurrence and self-attention in neural networks
Giancarlo Kerg
Bhargav Kanuparthi
Anirudh Goyal
Kyle Goyette
Yoshua Bengio
Guillaume Lajoie
57
9
0
16 Jun 2020
Results of the seventh edition of the BioASQ Challenge
A. Nentidis
K. Bougiatiotis
Anastasia Krithara
George Giannakopoulos
99
62
0
16 Jun 2020
PERL: Pivot-based Domain Adaptation for Pre-trained Deep Contextualized Embedding Models
Eyal Ben-David
Carmel Rabinovitz
Roi Reichart
SSL
121
63
0
16 Jun 2020
Preserving Dynamic Attention for Long-Term Spatial-Temporal Prediction
Haoxing Lin
Rufan Bai
Weijia Jia
Xinyu Yang
Yongjian You
HAI AI4TS
68
66
0
16 Jun 2020
Self-supervised Learning: Generative or Contrastive
Xiao Liu
Fanjin Zhang
Zhenyu Hou
Zhaoyu Wang
Li Mian
Jing Zhang
Jie Tang
SSL
215
1,648
0
15 Jun 2020
Transferring Monolingual Model to Low-Resource Language: The Case of Tigrinya
Abrhalei Tela
Abraham Woubie
Ville Hautamaki
78
13
0
13 Jun 2020
Mining Implicit Relevance Feedback from User Behavior for Web Question Answering
Linjun Shou
Shining Bo
Feixiang Cheng
Ming Gong
J. Pei
Daxin Jiang
69
9
0
13 Jun 2020
Rethinking Pre-training and Self-training
Barret Zoph
Golnaz Ghiasi
Nayeon Lee
Huayu Chen
Hanxiao Liu
E. D. Cubuk
Quoc V. Le
SSeg
112
656
0
11 Jun 2020
VirTex: Learning Visual Representations from Textual Annotations
Karan Desai
Justin Johnson
SSL VLM
173
437
0
11 Jun 2020
ScoreGAN: A Fraud Review Detector based on Multi Task Learning of Regulated GAN with Data Augmentation
Saeedreza Shehnepoor
R. Togneri
Wei Liu
Bennamoun
65
4
0
11 Jun 2020
Consolidating Commonsense Knowledge
Filip Ilievski
Pedro A. Szekely
Jingwei Cheng
Fu Zhang
Ehsan Qasemi
57
19
0
10 Jun 2020
Revisiting Few-sample BERT Fine-tuning
Tianyi Zhang
Felix Wu
Arzoo Katiyar
Kilian Q. Weinberger
Yoav Artzi
180
446
0
10 Jun 2020
MC-BERT: Efficient Language Pre-Training via a Meta Controller
Zhenhui Xu
Linyuan Gong
Guolin Ke
Di He
Shuxin Zheng
Liwei Wang
Jiang Bian
Tie-Yan Liu
BDL
65
18
0
10 Jun 2020
DFraud3: Multi-Component Fraud Detection free of Cold-start
Saeedreza Shehnepoor
R. Togneri
Wei Liu
Bennamoun
AAML
36
5
0
10 Jun 2020