ResearchTrend.AI

arXiv: 1906.08237
XLNet: Generalized Autoregressive Pretraining for Language Understanding

19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
    AI4CE

Papers citing "XLNet: Generalized Autoregressive Pretraining for Language Understanding"

50 / 3,520 papers shown
Galileo at SemEval-2020 Task 12: Multi-lingual Learning for Offensive Language Identification using Pre-trained Language Models
Shuohuan Wang
Jiaxiang Liu
Ouyang Xuan
Yu Sun
68
36
0
07 Oct 2020
What Can We Learn from Collective Human Opinions on Natural Language Inference Data?
Yixin Nie
Xiang Zhou
Joey Tianyi Zhou
108
138
0
07 Oct 2020
Improving QA Generalization by Concurrent Modeling of Multiple Biases
Mingzhu Wu
N. Moosavi
Andreas Rucklé
Iryna Gurevych
AI4CE
72
17
0
07 Oct 2020
A Self-Refinement Strategy for Noise Reduction in Grammatical Error Correction
Masato Mita
Shun Kiyono
Masahiro Kaneko
Jun Suzuki
Kentaro Inui
68
14
0
07 Oct 2020
Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information
Zehui Lin
Xiao Pan
Mingxuan Wang
Xipeng Qiu
Jiangtao Feng
Hao Zhou
Lei Li
62
129
0
07 Oct 2020
On Negative Interference in Multilingual Models: Findings and A Meta-Learning Treatment
Zirui Wang
Zachary Chase Lipton
Yulia Tsvetkov
93
32
0
06 Oct 2020
Exploring BERT's Sensitivity to Lexical Cues using Tests from Semantic Priming
Kanishka Misra
Allyson Ettinger
Julia Taylor Rayz
71
58
0
06 Oct 2020
A Review on Fact Extraction and Verification
Giannis Bekoulis
Christina Papagiannopoulou
Nikos Deligiannis
128
45
0
06 Oct 2020
Stepwise Extractive Summarization and Planning with Structured Transformers
Shashi Narayan
Joshua Maynez
Jakub Adamek
Daniele Pighin
Blaž Bratanič
Ryan T. McDonald
77
30
0
06 Oct 2020
Neural Mask Generator: Learning to Generate Adaptive Word Maskings for Language Model Adaptation
Minki Kang
Moonsu Han
Sung Ju Hwang
OOD
81
18
0
06 Oct 2020
Analyzing Individual Neurons in Pre-trained Language Models
Nadir Durrani
Hassan Sajjad
Fahim Dalvi
Yonatan Belinkov
MILM
60
104
0
06 Oct 2020
Poison Attacks against Text Datasets with Conditional Adversarially Regularized Autoencoder
Alvin Chan
Yi Tay
Yew-Soon Ong
Aston Zhang
SILM
78
58
0
06 Oct 2020
If beam search is the answer, what was the question?
Clara Meister
Tim Vieira
Ryan Cotterell
88
143
0
06 Oct 2020
StyleDGPT: Stylized Response Generation with Pre-trained Language Models
Ze Yang
Wei Wu
Can Xu
Xinnian Liang
Jiaqi Bai
Liran Wang
Wei Wang
Zhoujun Li
VLM
133
25
0
06 Oct 2020
LEGAL-BERT: The Muppets straight out of Law School
Ilias Chalkidis
Manos Fergadiotis
Prodromos Malakasiotis
Nikolaos Aletras
Ion Androutsopoulos
AILaw
77
265
0
06 Oct 2020
Efficient Meta Lifelong-Learning with Limited Memory
Zirui Wang
Sanket Vaibhav Mehta
Barnabás Póczós
J. Carbonell
CLL, KELM
81
76
0
06 Oct 2020
Help! Need Advice on Identifying Advice
Venkata S Govindarajan
Benjamin Chen
Rebecca Warholic
K. Erk
Junyi Jessy Li
49
17
0
06 Oct 2020
Modeling Preconditions in Text with a Crowd-sourced Dataset
Heeyoung Kwon
Mahnaz Koupaee
Pratyush Singh
Gargi Sawhney
Anmol Shukla
Keerthi Kumar Kallur
Nathanael Chambers
Niranjan Balasubramanian
39
15
0
06 Oct 2020
Mixup-Transformer: Dynamic Data Augmentation for NLP Tasks
Lichao Sun
Congying Xia
Wenpeng Yin
Tingting Liang
Philip S. Yu
Lifang He
62
36
0
05 Oct 2020
KGPT: Knowledge-Grounded Pre-Training for Data-to-Text Generation
Wenhu Chen
Yu-Chuan Su
Xifeng Yan
Wenjie Wang
VLM
139
22
0
05 Oct 2020
PUM at SemEval-2020 Task 12: Aggregation of Transformer-based models' features for offensive language recognition
P. Janiszewski
Mateusz Skiba
Urszula Walińska
44
2
0
05 Oct 2020
Pruning Redundant Mappings in Transformer Models via Spectral-Normalized Identity Prior
Zi Lin
Jeremiah Zhe Liu
Ziao Yang
Nan Hua
Dan Roth
96
47
0
05 Oct 2020
How Effective is Task-Agnostic Data Augmentation for Pretrained Transformers?
Shayne Longpre
Yu Wang
Christopher DuBois
ViT
86
85
0
05 Oct 2020
Effective Unsupervised Domain Adaptation with Adversarially Trained Language Models
Thuy-Trang Vu
Dinh Q. Phung
Gholamreza Haffari
86
25
0
05 Oct 2020
On Losses for Modern Language Models
Stephane Aroca-Ouellette
Frank Rudzicz
81
32
0
04 Oct 2020
Tell Me How to Ask Again: Question Data Augmentation with Controllable Rewriting in Continuous Space
Dayiheng Liu
Yeyun Gong
Jie Fu
Yu Yan
Jiusheng Chen
Jiancheng Lv
Nan Duan
M. Zhou
51
37
0
04 Oct 2020
Cross-Lingual Transfer Learning for Complex Word Identification
George-Eduard Zaharia
Dumitru-Clementin Cercel
M. Dascalu
49
13
0
02 Oct 2020
Syntax Representation in Word Embeddings and Neural Networks -- A Survey
Tomasz Limisiewicz
David Marecek
NAI
79
9
0
02 Oct 2020
LUKE: Deep Contextualized Entity Representations with Entity-aware Self-attention
Ikuya Yamada
Akari Asai
Hiroyuki Shindo
Hideaki Takeda
Yuji Matsumoto
135
676
0
02 Oct 2020
Which *BERT? A Survey Organizing Contextualized Encoders
Patrick Xia
Shijie Wu
Benjamin Van Durme
62
50
0
02 Oct 2020
JAKET: Joint Pre-training of Knowledge Graph and Language Understanding
Donghan Yu
Chenguang Zhu
Yiming Yang
Michael Zeng
KELM
83
145
0
02 Oct 2020
Data Transfer Approaches to Improve Seq-to-Seq Retrosynthesis
Katsuhiko Ishiguro
K. Ujihara
R. Sawada
Hirotaka Akita
Masaaki Kotera
115
6
0
02 Oct 2020
Near-imperceptible Neural Linguistic Steganography via Self-Adjusting Arithmetic Coding
Jiaming Shen
Heng Ji
Jiawei Han
51
38
0
01 Oct 2020
CoLAKE: Contextualized Language and Knowledge Embedding
Tianxiang Sun
Yunfan Shao
Xipeng Qiu
Qipeng Guo
Yaru Hu
Xuanjing Huang
Zheng Zhang
KELM
116
185
0
01 Oct 2020
Phonemer at WNUT-2020 Task 2: Sequence Classification Using COVID Twitter BERT and Bagging Ensemble Technique based on Plurality Voting
Anshul Wadhawan
47
7
0
01 Oct 2020
An Empirical Investigation Towards Efficient Multi-Domain Language Model Pre-training
Kristjan Arumae
Q. Sun
Parminder Bhatia
63
15
0
01 Oct 2020
Examining the rhetorical capacities of neural language models
Zining Zhu
Chuer Pan
Mohamed Abdalla
Frank Rudzicz
71
10
0
01 Oct 2020
Pea-KD: Parameter-efficient and Accurate Knowledge Distillation on BERT
Ikhyun Cho
U. Kang
25
1
0
30 Sep 2020
Measuring Systematic Generalization in Neural Proof Generation with Transformers
Nicolas Angelard-Gontier
Koustuv Sinha
Siva Reddy
C. Pal
LRM
106
64
0
30 Sep 2020
MaP: A Matrix-based Prediction Approach to Improve Span Extraction in Machine Reading Comprehension
Huaishao Luo
Yu Shi
Ming Gong
Linjun Shou
Tianrui Li
28
4
0
29 Sep 2020
Attention that does not Explain Away
Nan Ding
Xinjie Fan
Zhenzhong Lan
Dale Schuurmans
Radu Soricut
54
3
0
29 Sep 2020
Contrastive Distillation on Intermediate Representations for Language Model Compression
S. Sun
Zhe Gan
Yu Cheng
Yuwei Fang
Shuohang Wang
Jingjing Liu
VLM
78
73
0
29 Sep 2020
CokeBERT: Contextual Knowledge Selection and Embedding towards Enhanced Pre-Trained Language Models
Yusheng Su
Xu Han
Zhengyan Zhang
Peng Li
Zhiyuan Liu
Yankai Lin
Jie Zhou
Maosong Sun
ODL
80
25
0
29 Sep 2020
Utterance-level Dialogue Understanding: An Empirical Study
Deepanway Ghosal
Navonil Majumder
Rada Mihalcea
Soujanya Poria
90
23
0
29 Sep 2020
A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation
Dinghan Shen
Ming Zheng
Yelong Shen
Yanru Qu
Weizhu Chen
AAML
99
132
0
29 Sep 2020
Zero-Shot Clinical Acronym Expansion via Latent Meaning Cells
Griffin Adams
Mert Ketenci
Shreyas Bhave
A. Perotte
Noémie Elhadad
BDL
46
0
0
29 Sep 2020
Improve Transformer Models with Better Relative Position Embeddings
Zhiheng Huang
Davis Liang
Peng Xu
Bing Xiang
ViT
79
132
0
28 Sep 2020
Domain Adversarial Fine-Tuning as an Effective Regularizer
Giorgos Vernikos
Katerina Margatina
Alexandra Chronopoulou
Ion Androutsopoulos
70
15
0
28 Sep 2020
Deep Transformers with Latent Depth
Xian Li
Asa Cooper Stickland
Yuqing Tang
X. Kong
71
23
0
28 Sep 2020
Accelerating Multi-Model Inference by Merging DNNs of Different Weights
Joo Seong Jeong
Soojeong Kim
Gyeong-In Yu
Yunseong Lee
Byung-Gon Chun
FedML, MoMe, AI4CE
29
7
0
28 Sep 2020