ResearchTrend.AI
XLNet: Generalized Autoregressive Pretraining for Language Understanding

19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
    AI4CE
arXiv: 1906.08237

Papers citing "XLNet: Generalized Autoregressive Pretraining for Language Understanding"

50 / 3,518 papers shown
Title
Tree-structured Attention with Hierarchical Accumulation
Xuan-Phi Nguyen
Shafiq Joty
Guosheng Lin
R. Socher
58
76
0
19 Feb 2020
Gradient Boosting Neural Networks: GrowNet
Sarkhan Badirli
Xuanqing Liu
Zhengming Xing
Avradeep Bhowmik
Khoa D. Doan
S. Keerthi
FedML
58
87
0
19 Feb 2020
From English To Foreign Languages: Transferring Pre-trained Language Models
Ke M. Tran
55
52
0
18 Feb 2020
A Financial Service Chatbot based on Deep Bidirectional Transformers
S. Yu
Yuxin Chen
Hussain Zaidi
73
35
0
17 Feb 2020
Low-Rank Bottleneck in Multi-head Attention Models
Srinadh Bhojanapalli
Chulhee Yun
A. S. Rawat
Sashank J. Reddi
Sanjiv Kumar
71
96
0
17 Feb 2020
Convergence of End-to-End Training in Deep Unsupervised Contrastive Learning
Zixin Wen
SSL
68
3
0
17 Feb 2020
Incorporating BERT into Neural Machine Translation
Jinhua Zhu
Yingce Xia
Lijun Wu
Di He
Tao Qin
Wen-gang Zhou
Houqiang Li
Tie-Yan Liu
FedML
AIMat
50
360
0
17 Feb 2020
SBERT-WK: A Sentence Embedding Method by Dissecting BERT-based Word Models
Bin Wang
C.-C. Jay Kuo
50
156
0
16 Feb 2020
Robustness Verification for Transformers
Zhouxing Shi
Huan Zhang
Kai-Wei Chang
Minlie Huang
Cho-Jui Hsieh
AAML
89
109
0
16 Feb 2020
UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation
Huaishao Luo
Lei Ji
Botian Shi
Haoyang Huang
Nan Duan
Tianrui Li
Jason Li
Xilin Chen
Ming Zhou
VLM
128
438
0
15 Feb 2020
Fine-Tuning Pretrained Language Models: Weight Initializations, Data Orders, and Early Stopping
Jesse Dodge
Gabriel Ilharco
Roy Schwartz
Ali Farhadi
Hannaneh Hajishirzi
Noah A. Smith
107
598
0
15 Feb 2020
TwinBERT: Distilling Knowledge to Twin-Structured BERT Models for Efficient Retrieval
Wenhao Lu
Jian Jiao
Ruofei Zhang
60
50
0
14 Feb 2020
Stress Test Evaluation of Transformer-based Models in Natural Language Understanding Tasks
Carlos Aspillaga
Andrés Carvallo
Vladimir Araujo
ELM
75
31
0
14 Feb 2020
Transformer on a Diet
Chenguang Wang
Zihao Ye
Aston Zhang
Zheng Zhang
Alex Smola
91
8
0
14 Feb 2020
FQuAD: French Question Answering Dataset
Martin d'Hoffschmidt
Wacim Belblidia
Tom Brendlé
Quentin Heinrich
Maxime Vidal
118
100
0
14 Feb 2020
HULK: An Energy Efficiency Benchmark Platform for Responsible Natural Language Processing
Xiyou Zhou
Zhiyu Zoey Chen
Xiaoyong Jin
Wenjie Wang
78
34
0
14 Feb 2020
On Layer Normalization in the Transformer Architecture
Ruibin Xiong
Yunchang Yang
Di He
Kai Zheng
Shuxin Zheng
Chen Xing
Huishuai Zhang
Yanyan Lan
Liwei Wang
Tie-Yan Liu
AI4CE
160
1,002
0
12 Feb 2020
Feature Importance Estimation with Self-Attention Networks
Blaž Škrlj
Jannis Brugger
Nada Lavrac
Matej Petković
FAtt
MILM
88
52
0
11 Feb 2020
ReClor: A Reading Comprehension Dataset Requiring Logical Reasoning
Weihao Yu
Zihang Jiang
Yanfei Dong
Jiashi Feng
LRM
173
255
0
11 Feb 2020
Exploring Chemical Space using Natural Language Processing Methodologies for Drug Discovery
Hakime Öztürk
Arzucan Özgür
P. Schwaller
Teodoro Laino
Elif Özkirimli
98
122
0
10 Feb 2020
Localized Flood Detection With Minimal Labeled Social Media Data Using Transfer Learning
Neha Singh
Nirmalya Roy
A. Gangopadhyay
81
6
0
10 Feb 2020
How Much Knowledge Can You Pack Into the Parameters of a Language Model?
Adam Roberts
Colin Raffel
Noam M. Shazeer
KELM
144
898
0
10 Feb 2020
Pre-training Tasks for Embedding-based Large-scale Retrieval
Wei-Cheng Chang
Felix X. Yu
Yin-Wen Chang
Yiming Yang
Sanjiv Kumar
RALM
102
306
0
10 Feb 2020
Blank Language Models
T. Shen
Victor Quach
Regina Barzilay
Tommi Jaakkola
288
73
0
08 Feb 2020
Snippext: Semi-supervised Opinion Mining with Augmented Data
Zhengjie Miao
Yuliang Li
Xiaolan Wang
W. Tan
RALM
VLM
76
91
0
07 Feb 2020
BERT-of-Theseus: Compressing BERT by Progressive Module Replacing
Canwen Xu
Wangchunshu Zhou
Tao Ge
Furu Wei
Ming Zhou
348
201
0
07 Feb 2020
perm2vec: Graph Permutation Selection for Decoding of Error Correction Codes using Self-Attention
Nir Raviv
Avi Caciularu
Tomer Raviv
Jacob Goldberger
Yair Be’ery
63
8
0
06 Feb 2020
Aligning the Pretraining and Finetuning Objectives of Language Models
Nuo Wang Pierse
Jing Lu
AI4CE
35
2
0
05 Feb 2020
K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters
Ruize Wang
Duyu Tang
Nan Duan
Zhongyu Wei
Xuanjing Huang
Jianshu Ji
Guihong Cao
Daxin Jiang
Ming Zhou
KELM
148
557
0
05 Feb 2020
CoTK: An Open-Source Toolkit for Fast Development and Fair Evaluation of Text Generation
Fei Huang
Dazhen Wan
Zhihong Shao
Pei Ke
Jian Guan
Yilin Niu
Xiaoyan Zhu
Minlie Huang
50
5
0
03 Feb 2020
Schema-Guided Dialogue State Tracking Task at DSTC8
Abhinav Rastogi
Xiaoxue Zang
Srinivas Sunkara
Raghav Gupta
Pranav Khaitan
75
42
0
02 Feb 2020
Fine-Tuning BERT for Schema-Guided Zero-Shot Dialogue State Tracking
Yu-Ping Ruan
Zhenhua Ling
Jia-Chen Gu
Quan Liu
73
20
0
01 Feb 2020
Are Pre-trained Language Models Aware of Phrases? Simple but Strong Baselines for Grammar Induction
Taeuk Kim
Jihun Choi
Daniel Edmiston
Sang-goo Lee
70
90
0
30 Jan 2020
Retrospective Reader for Machine Reading Comprehension
Zhuosheng Zhang
Junjie Yang
Hai Zhao
RALM
104
227
0
27 Jan 2020
Asking Questions the Human Way: Scalable Question-Answer Generation from Text Corpus
Bang Liu
Haojie Wei
Di Niu
Haolan Chen
Yancheng He
88
93
0
27 Jan 2020
DUMA: Reading Comprehension with Transposition Thinking
Pengfei Zhu
Hai Zhao
Xiaoguang Li
AI4CE
84
35
0
26 Jan 2020
ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation
Dongling Xiao
Han Zhang
Yukun Li
Yu Sun
Hao Tian
Hua Wu
Haifeng Wang
85
127
0
26 Jan 2020
Scaling Laws for Neural Language Models
Jared Kaplan
Sam McCandlish
T. Henighan
Tom B. Brown
B. Chess
R. Child
Scott Gray
Alec Radford
Jeff Wu
Dario Amodei
668
4,937
0
23 Jan 2020
Multilingual Denoising Pre-training for Neural Machine Translation
Yinhan Liu
Jiatao Gu
Naman Goyal
Xian Li
Sergey Edunov
Marjan Ghazvininejad
M. Lewis
Luke Zettlemoyer
AI4CE
AIMat
128
1,818
0
22 Jan 2020
ImageBERT: Cross-modal Pre-training with Large-scale Weak-supervised Image-Text Data
Di Qi
Lin Su
Jianwei Song
Edward Cui
Taroon Bharti
Arun Sacheti
VLM
116
263
0
22 Jan 2020
Elephant in the Room: An Evaluation Framework for Assessing Adversarial Examples in NLP
Ying Xu
Xu Zhong
Antonio Jimeno Yepes
Jey Han Lau
AAML
63
10
0
22 Jan 2020
Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
Timo Schick
Hinrich Schütze
369
1,629
0
21 Jan 2020
A multimodal deep learning approach for named entity recognition from social media
M. Asgari-Chenaghlu
M. Feizi-Derakhshi
Leili Farzinvash
M. Balafar
C. Motamed
60
29
0
19 Jan 2020
A Common Semantic Space for Monolingual and Cross-Lingual Meta-Embeddings
G. R. Claramunt
Rodrigo Agerri
German Rigau
67
7
0
17 Jan 2020
CLUENER2020: Fine-grained Named Entity Recognition Dataset and Benchmark for Chinese
Liang Xu
Yu Tong
Qianqian Dong
Yixuan Liao
Cong Yu
Yin Tian
Weitang Liu
Lu Li
Caiquan Liu
Xuanwei Zhang
96
54
0
13 Jan 2020
Natural Image Matting via Guided Contextual Attention
Yaoyi Li
Hongtao Lu
82
169
0
13 Jan 2020
ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training
Weizhen Qi
Yu Yan
Yeyun Gong
Dayiheng Liu
Nan Duan
Jiusheng Chen
Ruofei Zhang
Ming Zhou
AI4TS
140
450
0
13 Jan 2020
Exploring and Improving Robustness of Multi Task Deep Neural Networks via Domain Agnostic Defenses
Kashyap Coimbatore Murali
AAML
OOD
29
0
0
11 Jan 2020
PatentTransformer-2: Controlling Patent Text Generation by Structural Metadata
Jieh-Sheng Lee
J. Hsiang
29
10
0
11 Jan 2020
An Exploration of Embodied Visual Exploration
Santhosh Kumar Ramakrishnan
Dinesh Jayaraman
Kristen Grauman
LM&Ro
93
100
0
07 Jan 2020