ResearchTrend.AI
XLNet: Generalized Autoregressive Pretraining for Language Understanding

19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
    AI4CE

Papers citing "XLNet: Generalized Autoregressive Pretraining for Language Understanding"

50 / 3,518 papers shown
FastBERT: a Self-distilling BERT with Adaptive Inference Time
Weijie Liu
Peng Zhou
Zhe Zhao
Zhiruo Wang
Haotang Deng
Qi Ju
97
361
0
05 Apr 2020
Unsupervised Domain Clusters in Pretrained Language Models
Roee Aharoni
Yoav Goldberg
103
252
0
05 Apr 2020
Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space
Chunyuan Li
Xiang Gao
Yuan Li
Baolin Peng
Xiujun Li
Yizhe Zhang
Jianfeng Gao
SSL
DRL
86
182
0
05 Apr 2020
XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation
Yaobo Liang
Nan Duan
Yeyun Gong
Ning Wu
Fenfei Guo
...
Shuguang Liu
Fan Yang
Daniel Fernando Campos
Rangan Majumder
Ming Zhou
ELM
VLM
115
350
0
03 Apr 2020
Deep Entity Matching with Pre-Trained Language Models
Yuliang Li
Jinfeng Li
Yoshihiko Suhara
A. Doan
W. Tan
VLM
108
391
0
01 Apr 2020
Information Leakage in Embedding Models
Congzheng Song
A. Raghunathan
MIACV
92
274
0
31 Mar 2020
Give your Text Representation Models some Love: the Case for Basque
Rodrigo Agerri
Iñaki San Vicente
Jon Ander Campos
Ander Barrena
X. Saralegi
Aitor Soroa Etxabe
Eneko Agirre
59
63
0
31 Mar 2020
Abstractive Text Summarization based on Language Model Conditioning and Locality Modeling
Dmitrii Aksenov
J. Moreno-Schneider
Peter Bourgonje
Robert Schwarzenberg
Leonhard Hennig
Georg Rehm
115
26
0
29 Mar 2020
Actor-Transformers for Group Activity Recognition
Kirill Gavrilyuk
Ryan Sanford
Mehrsan Javan
Cees G. M. Snoek
ViT
73
181
0
28 Mar 2020
A Survey of Deep Learning for Scientific Discovery
M. Raghu
Erica Schmidt
OOD
AI4CE
182
123
0
26 Mar 2020
VIOLIN: A Large-Scale Dataset for Video-and-Language Inference
J. Liu
Wenhu Chen
Yu Cheng
Zhe Gan
Licheng Yu
Yiming Yang
Jingjing Liu
MLLM
VGen
102
70
0
25 Mar 2020
The value of text for small business default prediction: A deep learning approach
Matthew Stevenson
Christophe Mues
Cristián Bravo
104
80
0
19 Mar 2020
TTTTTackling WinoGrande Schemas
Sheng-Chieh Lin
Jheng-Hong Yang
Rodrigo Nogueira
Ming-Feng Tsai
Chuan-Ju Wang
Jimmy Lin
55
6
0
18 Mar 2020
Pre-trained Models for Natural Language Processing: A Survey
Xipeng Qiu
Tianxiang Sun
Yige Xu
Yunfan Shao
Ning Dai
Xuanjing Huang
LM&MA
VLM
393
1,500
0
18 Mar 2020
Calibration of Pre-trained Transformers
Shrey Desai
Greg Durrett
UQLM
347
302
0
17 Mar 2020
Overview of the TREC 2019 deep learning track
Nick Craswell
Bhaskar Mitra
Emine Yilmaz
Daniel Fernando Campos
E. Voorhees
248
496
0
17 Mar 2020
A Survey on Contextual Embeddings
Qi Liu
Matt J. Kusner
Phil Blunsom
276
151
0
16 Mar 2020
TRANS-BLSTM: Transformer with Bidirectional LSTM for Language Understanding
Zhiheng Huang
Peng Xu
Davis Liang
Ajay K. Mishra
Bing Xiang
40
31
0
16 Mar 2020
Finnish Language Modeling with Deep Transformer Models
Abhilash Jain
Aku Rouhe
Stig-Arne Gronroos
M. Kurimo
14
0
0
14 Mar 2020
Using word embeddings to improve the discriminability of co-occurrence text networks
L. Quispe
J. V. Tohalino
D. R. Amancio
21
1
0
13 Mar 2020
Learning to Encode Position for Transformer with Continuous Dynamical Model
Xuanqing Liu
Hsiang-Fu Yu
Inderjit Dhillon
Cho-Jui Hsieh
85
112
0
13 Mar 2020
Efficient Content-Based Sparse Attention with Routing Transformers
Aurko Roy
M. Saffar
Ashish Vaswani
David Grangier
MoE
401
606
0
12 Mar 2020
Hurtful Words: Quantifying Biases in Clinical Contextual Word Embeddings
H. Zhang
Amy X. Lu
Mohamed Abdalla
Matthew B. A. McDermott
Marzyeh Ghassemi
75
176
0
11 Mar 2020
Hybrid Attention-Based Transformer Block Model for Distant Supervision Relation Extraction
Yan Xiao
Yaochu Jin
Ran Cheng
K. Hao
70
32
0
10 Mar 2020
Neuro-symbolic Architectures for Context Understanding
A. Oltramari
Jonathan M Francis
C. Henson
Kaixin Ma
Ruwan Wickramarachchi
NAI
AI4CE
66
29
0
09 Mar 2020
IMRAM: Iterative Matching with Recurrent Attention Memory for Cross-Modal Image-Text Retrieval
Hui Chen
Guiguang Ding
Xudong Liu
Zijia Lin
Ji Liu
Jungong Han
78
326
0
08 Mar 2020
HypoNLI: Exploring the Artificial Patterns of Hypothesis-only Bias in Natural Language Inference
Tianyu Liu
Xin Zheng
Baobao Chang
Zhifang Sui
111
24
0
05 Mar 2020
A Study on Efficiency, Accuracy and Document Structure for Answer Sentence Selection
Daniele Bonadiman
Alessandro Moschitti
RALM
68
10
0
04 Mar 2020
jiant: A Software Toolkit for Research on General-Purpose Text Understanding Models
Yada Pruksachatkun
Philip Yeres
Haokun Liu
Jason Phang
Phu Mon Htut
Alex Jinpeng Wang
Ian Tenney
Samuel R. Bowman
SSeg
48
94
0
04 Mar 2020
SeMemNN: A Semantic Matrix-Based Memory Neural Network for Text Classification
Changzeng Fu
Chaoran Liu
C. Ishi
Yuichiro Yoshikawa
H. Ishiguro
47
17
0
04 Mar 2020
XGPT: Cross-modal Generative Pre-Training for Image Captioning
Qiaolin Xia
Haoyang Huang
Nan Duan
Dongdong Zhang
Lei Ji
Zhifang Sui
Edward Cui
Taroon Bharti
Xin Liu
Ming Zhou
MLLM
VLM
103
76
0
03 Mar 2020
Heterogeneous Graph Transformer
Ziniu Hu
Yuxiao Dong
Kuansan Wang
Yizhou Sun
299
1,217
0
03 Mar 2020
Med7: a transferable clinical natural language processing model for electronic health records
Andrey Kormilitzin
N. Vaci
Qiang Liu
A. Nevado-Holgado
97
120
0
03 Mar 2020
A Question-Centric Model for Visual Question Answering in Medical Imaging
Minh H. Vu
Tommy Löfstedt
T. Nyholm
Raphael Sznitman
MedIm
76
61
0
02 Mar 2020
Style Example-Guided Text Generation using Generative Adversarial Transformers
Kuo-Hao Zeng
Mohammad Shoeybi
Ming-Yuan Liu
GAN
93
18
0
02 Mar 2020
Depth-Adaptive Graph Recurrent Network for Text Classification
Yijin Liu
Fandong Meng
Jinan Xu
Jie Zhou
46
3
0
29 Feb 2020
AraBERT: Transformer-based Model for Arabic Language Understanding
Wissam Antoun
Fady Baly
Hazem M. Hajj
166
975
0
28 Feb 2020
UniLMv2: Pseudo-Masked Language Models for Unified Language Model Pre-Training
Hangbo Bao
Li Dong
Furu Wei
Wenhui Wang
Nan Yang
...
Yu Wang
Songhao Piao
Jianfeng Gao
Ming Zhou
H. Hon
AI4CE
88
397
0
28 Feb 2020
TextBrewer: An Open-Source Knowledge Distillation Toolkit for Natural Language Processing
Ziqing Yang
Yiming Cui
Zhipeng Chen
Wanxiang Che
Ting Liu
Shijin Wang
Guoping Hu
VLM
75
48
0
28 Feb 2020
Few-shot Natural Language Generation for Task-Oriented Dialog
Baolin Peng
Chenguang Zhu
Chunyuan Li
Xiujun Li
Jinchao Li
Michael Zeng
Jianfeng Gao
91
201
0
27 Feb 2020
A Primer in BERTology: What we know about how BERT works
Anna Rogers
Olga Kovaleva
Anna Rumshisky
OffRL
143
1,511
0
27 Feb 2020
Compressing Large-Scale Transformer-Based Models: A Case Study on BERT
Prakhar Ganesh
Yao Chen
Xin Lou
Mohammad Ali Khan
Yifan Yang
Hassan Sajjad
Preslav Nakov
Deming Chen
Marianne Winslett
AI4CE
138
201
0
27 Feb 2020
Disentangling Adaptive Gradient Methods from Learning Rates
Naman Agarwal
Rohan Anil
Elad Hazan
Tomer Koren
Cyril Zhang
109
38
0
26 Feb 2020
Multi-task Learning with Multi-head Attention for Multi-choice Reading Comprehension
H. Wan
122
13
0
26 Feb 2020
MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers
Wenhui Wang
Furu Wei
Li Dong
Hangbo Bao
Nan Yang
Ming Zhou
VLM
255
1,285
0
25 Feb 2020
KEML: A Knowledge-Enriched Meta-Learning Framework for Lexical Relation Classification
Chengyu Wang
Minghui Qiu
Jun Huang
Xiaofeng He
VLM
KELM
106
13
0
25 Feb 2020
Low-Resource Knowledge-Grounded Dialogue Generation
Xueliang Zhao
Wei Wu
Chongyang Tao
Can Xu
Dongyan Zhao
Rui Yan
117
110
0
24 Feb 2020
Improving BERT Fine-Tuning via Self-Ensemble and Self-Distillation
Yige Xu
Xipeng Qiu
L. Zhou
Xuanjing Huang
83
67
0
24 Feb 2020
Compressing BERT: Studying the Effects of Weight Pruning on Transfer Learning
Mitchell A. Gordon
Kevin Duh
Nicholas Andrews
VLM
84
343
0
19 Feb 2020
CodeBERT: A Pre-Trained Model for Programming and Natural Languages
Zhangyin Feng
Daya Guo
Duyu Tang
Nan Duan
Xiaocheng Feng
...
Linjun Shou
Bing Qin
Ting Liu
Daxin Jiang
Ming Zhou
224
2,717
0
19 Feb 2020