XLNet: Generalized Autoregressive Pretraining for Language Understanding

19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
    AI4CE

Papers citing "XLNet: Generalized Autoregressive Pretraining for Language Understanding"

50 / 3,522 papers shown
Title
Low Anisotropy Sense Retrofitting (LASeR) : Towards Isotropic and Sense Enriched Representations
Geetanjali Bihani
Julia Taylor Rayz
68
13
0
22 Apr 2021
A Short Survey of Pre-trained Language Models for Conversational AI-A NewAge in NLP
Munazza Zaib
Quan Z. Sheng
W. Zhang
77
72
0
22 Apr 2021
Should we Stop Training More Monolingual Models, and Simply Use Machine Translation Instead?
T. Isbister
F. Carlsson
Magnus Sahlgren
95
25
0
21 Apr 2021
Sattiy at SemEval-2021 Task 9: An Ensemble Solution for Statement Verification and Evidence Finding with Tables
Xiaoyi Ruan
Meizhi Jin
Jian Ma
Haiqing Yang
Lian-Xin Jiang
Yang Mo
Mengyuan Zhou
LMTD
65
2
0
21 Apr 2021
Sensitivity as a Complexity Measure for Sequence Classification Tasks
Michael Hahn
Dan Jurafsky
Richard Futrell
197
22
0
21 Apr 2021
Identify, Align, and Integrate: Matching Knowledge Graphs to Commonsense Reasoning Tasks
Lisa Bauer
Mohit Bansal
48
19
0
20 Apr 2021
Enhancing Cognitive Models of Emotions with Representation Learning
Yuting Guo
Jinho Choi
48
5
0
20 Apr 2021
RoFormer: Enhanced Transformer with Rotary Position Embedding
Jianlin Su
Yu Lu
Shengfeng Pan
Ahmed Murtadha
Bo Wen
Yunfeng Liu
382
2,555
0
20 Apr 2021
Efficient pre-training objectives for Transformers
Luca Di Liello
Matteo Gabburo
Alessandro Moschitti
42
15
0
20 Apr 2021
Training Value-Aligned Reinforcement Learning Agents Using a Normative Prior
Md Sultan al Nahian
Spencer Frazier
Brent Harrison
Mark O. Riedl
97
19
0
19 Apr 2021
Understanding Chinese Video and Language via Contrastive Multimodal Pre-Training
Chenyi Lei
Shixian Luo
Yong Liu
Wanggui He
Jiamang Wang
Guoxin Wang
Haihong Tang
Chunyan Miao
Houqiang Li
60
42
0
19 Apr 2021
TREC Deep Learning Track: Reusable Test Collections in the Large Data Regime
Nick Craswell
Bhaskar Mitra
Emine Yilmaz
Daniel Fernando Campos
E. Voorhees
I. Soboroff
76
52
0
19 Apr 2021
Improving Transformer-Kernel Ranking Model Using Conformer and Query Term Independence
Bhaskar Mitra
Sebastian Hofstatter
Hamed Zamani
Nick Craswell
81
8
0
19 Apr 2021
A novel time-frequency Transformer based on self-attention mechanism and its application in fault diagnosis of rolling bearings
Yifei Ding
M. Jia
Qiuhua Miao
Yudong Cao
59
290
0
19 Apr 2021
On the Use of Context for Predicting Citation Worthiness of Sentences in Scholarly Articles
Rakesh Gosangi
Ravneet Arora
Mohsen Gheisarieha
Debanjan Mahata
Haimin Zhang
45
10
0
18 Apr 2021
Emotion-Regularized Conditional Variational Autoencoder for Emotional Response Generation
Yu-Ping Ruan
Zhenhua Ling
DRL
80
16
0
18 Apr 2021
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity
Yao Lu
Max Bartolo
Alastair Moore
Sebastian Riedel
Pontus Stenetorp
AILaw
LRM
461
1,200
0
18 Apr 2021
A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation
Tianyu Liu
Yizhe Zhang
Chris Brockett
Yi Mao
Zhifang Sui
Weizhu Chen
W. Dolan
HILM
297
149
0
18 Apr 2021
A Simple and Effective Positional Encoding for Transformers
Pu-Chin Chen
Henry Tsai
Srinadh Bhojanapalli
Hyung Won Chung
Yin-Wen Chang
Chun-Sung Ferng
120
66
0
18 Apr 2021
Linguistic Dependencies and Statistical Dependence
Jacob Louis Hoover
Alessandro Sordoni
Wenyu Du
Timothy J. O'Donnell
75
15
0
18 Apr 2021
"Average" Approximates "First Principal Component"? An Empirical Analysis on Representations from Neural Language Models
Zihan Wang
Chengyu Dong
Jingbo Shang
FAtt
140
4
0
18 Apr 2021
Characterizing Idioms: Conventionality and Contingency
Michaela Socolof
Jackie C.K. Cheung
Michael Wagner
Timothy J. O'Donnell
25
6
0
17 Apr 2021
Identifying the Limits of Cross-Domain Knowledge Transfer for Pretrained Models
Zhengxuan Wu
Nelson F. Liu
Christopher Potts
45
3
0
17 Apr 2021
AMMU : A Survey of Transformer-based Biomedical Pretrained Language Models
Katikapalli Subramanyam Kalyan
A. Rajasekharan
S. Sangeetha
LM&MA
MedIm
117
170
0
16 Apr 2021
Condenser: a Pre-training Architecture for Dense Retrieval
Luyu Gao
Jamie Callan
AI4CE
69
269
0
16 Apr 2021
DEUX: An Attribute-Guided Framework for Sociable Recommendation Dialog Systems
Yu Li
Shirley Anugrah Hayati
Weiyan Shi
Zhou Yu
64
5
0
16 Apr 2021
Write-a-speaker: Text-based Emotional and Rhythmic Talking-head Generation
Lilin Cheng
Suzhe Wang
Zhimeng Zhang
Yu-qiong Ding
Yixing Zheng
Xin Yu
Changjie Fan
VGen
45
71
0
16 Apr 2021
Time-Stamped Language Model: Teaching Language Models to Understand the Flow of Events
Hossein Rajaby Faghihi
Parisa Kordjamshidi
65
25
0
15 Apr 2021
Gradient-based Adversarial Attacks against Text Transformers
Chuan Guo
Alexandre Sablayrolles
Hervé Jégou
Douwe Kiela
SILM
165
248
0
15 Apr 2021
Syntactic Perturbations Reveal Representational Correlates of Hierarchical Phrase Structure in Pretrained Language Models
Matteo Alleman
J. Mamou
Miguel Rio
Hanlin Tang
Yoon Kim
SueYeon Chung
NAI
100
17
0
15 Apr 2021
Hierarchical Learning for Generation with Long Source Sequences
T. Rohde
Xiaoxia Wu
Yinhan Liu
BDL
VLM
76
56
0
15 Apr 2021
Quantifying Gender Bias Towards Politicians in Cross-Lingual Language Models
Karolina Stańczak
Sagnik Ray Choudhury
Tiago Pimentel
Ryan Cotterell
Isabelle Augenstein
84
24
0
15 Apr 2021
Natural Language Understanding with Privacy-Preserving BERT
Chen Qu
Weize Kong
Liu Yang
Mingyang Zhang
Michael Bendersky
Marc Najork
103
76
0
15 Apr 2021
Effect of Post-processing on Contextualized Word Representations
Hassan Sajjad
Firoj Alam
Fahim Dalvi
Nadir Durrani
61
9
0
15 Apr 2021
Pseudo Zero Pronoun Resolution Improves Zero Anaphora Resolution
Ryuto Konno
Shun Kiyono
Yuichiroh Matsubayashi
Hiroki Ouchi
Kentaro Inui
19
10
0
15 Apr 2021
Consistency Training with Virtual Adversarial Discrete Perturbation
Jungsoo Park
Gyuwan Kim
Jaewoo Kang
76
15
0
15 Apr 2021
Lattice-BERT: Leveraging Multi-Granularity Representations in Chinese Pre-trained Language Models
Yuxuan Lai
Yijia Liu
Yansong Feng
Songfang Huang
Dongyan Zhao
VLM
AI4CE
77
38
0
15 Apr 2021
COIL: Revisit Exact Lexical Match in Information Retrieval with Contextualized Inverted List
Luyu Gao
Zhuyun Dai
Jamie Callan
87
220
0
15 Apr 2021
Disentangling Representations of Text by Masking Transformers
Xiongyi Zhang
Jan-Willem van de Meent
Byron C. Wallace
DRL
64
21
0
14 Apr 2021
UDALM: Unsupervised Domain Adaptation through Language Modeling
Constantinos F. Karouzos
Georgios Paraskevopoulos
Alexandros Potamianos
72
57
0
14 Apr 2021
K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce
Song Xu
Haoran Li
Peng Yuan
Yujia Wang
Youzheng Wu
Xiaodong He
Ying Liu
Bowen Zhou
KELM
91
24
0
14 Apr 2021
Enhancing Interpretable Clauses Semantically using Pretrained Word Representation
Rohan Kumar Yadav
Lei Jiao
Ole-Christoffer Granmo
Morten Goodwin
NAI
67
16
0
14 Apr 2021
I Wish I Would Have Loved This One, But I Didn't -- A Multilingual Dataset for Counterfactual Detection in Product Reviews
James O'Neill
Polina Rozenshtein
Ryuichi Kiryo
Motoko Kubota
Danushka Bollegala
76
31
0
14 Apr 2021
Knowledge-driven Answer Generation for Conversational Search
Mariana Leite
Rafael Ferreira
David Semedo
João Magalhães
RALM
KELM
78
1
0
14 Apr 2021
AR-LSAT: Investigating Analytical Reasoning of Text
Wanjun Zhong
Siyuan Wang
Duyu Tang
Zenan Xu
Daya Guo
Jiahai Wang
Jian Yin
Ming Zhou
Nan Duan
ELM
137
44
0
14 Apr 2021
Demystifying BERT: Implications for Accelerator Design
Suchita Pati
Shaizeen Aga
Nuwan Jayasena
Matthew D. Sinclair
LLMAG
88
17
0
14 Apr 2021
Developing a Conversational Recommendation System for Navigating Limited Options
Victor S. Bursztyn
Jennifer Healey
Eunyee Koh
Nedim Lipka
Larry Birnbaum
25
7
0
13 Apr 2021
BERT Embeddings Can Track Context in Conversational Search
Rafael Ferreira
David Semedo
João Magalhães
AI4TS
55
0
0
13 Apr 2021
The Future is not One-dimensional: Complex Event Schema Induction by Graph Modeling for Event Prediction
Manling Li
Sha Li
Zhenhailong Wang
Lifu Huang
Kyunghyun Cho
Heng Ji
Jiawei Han
Clare R. Voss
114
58
0
13 Apr 2021
Reducing Discontinuous to Continuous Parsing with Pointer Network Reordering
Daniel Fernández-González
Carlos Gómez-Rodríguez
35
11
0
13 Apr 2021