1906.08237
XLNet: Generalized Autoregressive Pretraining for Language Understanding
19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
AI4CE
Papers citing
"XLNet: Generalized Autoregressive Pretraining for Language Understanding"
50 / 3,522 papers shown
Low Anisotropy Sense Retrofitting (LASeR) : Towards Isotropic and Sense Enriched Representations
Geetanjali Bihani
Julia Taylor Rayz
68
13
0
22 Apr 2021
A Short Survey of Pre-trained Language Models for Conversational AI-A NewAge in NLP
Munazza Zaib
Quan Z. Sheng
W. Zhang
77
72
0
22 Apr 2021
Should we Stop Training More Monolingual Models, and Simply Use Machine Translation Instead?
T. Isbister
F. Carlsson
Magnus Sahlgren
95
25
0
21 Apr 2021
Sattiy at SemEval-2021 Task 9: An Ensemble Solution for Statement Verification and Evidence Finding with Tables
Xiaoyi Ruan
Meizhi Jin
Jian Ma
Haiqing Yang
Lian-Xin Jiang
Yang Mo
Mengyuan Zhou
LMTD
65
2
0
21 Apr 2021
Sensitivity as a Complexity Measure for Sequence Classification Tasks
Michael Hahn
Dan Jurafsky
Richard Futrell
197
22
0
21 Apr 2021
Identify, Align, and Integrate: Matching Knowledge Graphs to Commonsense Reasoning Tasks
Lisa Bauer
Mohit Bansal
48
19
0
20 Apr 2021
Enhancing Cognitive Models of Emotions with Representation Learning
Yuting Guo
Jinho Choi
48
5
0
20 Apr 2021
RoFormer: Enhanced Transformer with Rotary Position Embedding
Jianlin Su
Yu Lu
Shengfeng Pan
Ahmed Murtadha
Bo Wen
Yunfeng Liu
382
2,555
0
20 Apr 2021
Efficient pre-training objectives for Transformers
Luca Di Liello
Matteo Gabburo
Alessandro Moschitti
42
15
0
20 Apr 2021
Training Value-Aligned Reinforcement Learning Agents Using a Normative Prior
Md Sultan al Nahian
Spencer Frazier
Brent Harrison
Mark O. Riedl
97
19
0
19 Apr 2021
Understanding Chinese Video and Language via Contrastive Multimodal Pre-Training
Chenyi Lei
Shixian Luo
Yong Liu
Wanggui He
Jiamang Wang
Guoxin Wang
Haihong Tang
Chunyan Miao
Houqiang Li
60
42
0
19 Apr 2021
TREC Deep Learning Track: Reusable Test Collections in the Large Data Regime
Nick Craswell
Bhaskar Mitra
Emine Yilmaz
Daniel Fernando Campos
E. Voorhees
I. Soboroff
76
52
0
19 Apr 2021
Improving Transformer-Kernel Ranking Model Using Conformer and Query Term Independence
Bhaskar Mitra
Sebastian Hofstätter
Hamed Zamani
Nick Craswell
81
8
0
19 Apr 2021
A novel time-frequency Transformer based on self-attention mechanism and its application in fault diagnosis of rolling bearings
Yifei Ding
M. Jia
Qiuhua Miao
Yudong Cao
59
290
0
19 Apr 2021
On the Use of Context for Predicting Citation Worthiness of Sentences in Scholarly Articles
Rakesh Gosangi
Ravneet Arora
Mohsen Gheisarieha
Debanjan Mahata
Haimin Zhang
45
10
0
18 Apr 2021
Emotion-Regularized Conditional Variational Autoencoder for Emotional Response Generation
Yu-Ping Ruan
Zhenhua Ling
DRL
80
16
0
18 Apr 2021
Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity
Yao Lu
Max Bartolo
Alastair Moore
Sebastian Riedel
Pontus Stenetorp
AILaw
LRM
461
1,200
0
18 Apr 2021
A Token-level Reference-free Hallucination Detection Benchmark for Free-form Text Generation
Tianyu Liu
Yizhe Zhang
Chris Brockett
Yi Mao
Zhifang Sui
Weizhu Chen
W. Dolan
HILM
297
149
0
18 Apr 2021
A Simple and Effective Positional Encoding for Transformers
Pu-Chin Chen
Henry Tsai
Srinadh Bhojanapalli
Hyung Won Chung
Yin-Wen Chang
Chun-Sung Ferng
120
66
0
18 Apr 2021
Linguistic Dependencies and Statistical Dependence
Jacob Louis Hoover
Alessandro Sordoni
Wenyu Du
Timothy J. O'Donnell
75
15
0
18 Apr 2021
"Average" Approximates "First Principal Component"? An Empirical Analysis on Representations from Neural Language Models
Zihan Wang
Chengyu Dong
Jingbo Shang
FAtt
140
4
0
18 Apr 2021
Characterizing Idioms: Conventionality and Contingency
Michaela Socolof
Jackie C.K. Cheung
Michael Wagner
Timothy J. O'Donnell
25
6
0
17 Apr 2021
Identifying the Limits of Cross-Domain Knowledge Transfer for Pretrained Models
Zhengxuan Wu
Nelson F. Liu
Christopher Potts
45
3
0
17 Apr 2021
AMMU : A Survey of Transformer-based Biomedical Pretrained Language Models
Katikapalli Subramanyam Kalyan
A. Rajasekharan
S. Sangeetha
LM&MA
MedIm
117
170
0
16 Apr 2021
Condenser: a Pre-training Architecture for Dense Retrieval
Luyu Gao
Jamie Callan
AI4CE
69
269
0
16 Apr 2021
DEUX: An Attribute-Guided Framework for Sociable Recommendation Dialog Systems
Yu Li
Shirley Anugrah Hayati
Weiyan Shi
Zhou Yu
64
5
0
16 Apr 2021
Write-a-speaker: Text-based Emotional and Rhythmic Talking-head Generation
Lilin Cheng
Suzhe Wang
Zhimeng Zhang
Yu-qiong Ding
Yixing Zheng
Xin Yu
Changjie Fan
VGen
45
71
0
16 Apr 2021
Time-Stamped Language Model: Teaching Language Models to Understand the Flow of Events
Hossein Rajaby Faghihi
Parisa Kordjamshidi
65
25
0
15 Apr 2021
Gradient-based Adversarial Attacks against Text Transformers
Chuan Guo
Alexandre Sablayrolles
Hervé Jégou
Douwe Kiela
SILM
165
248
0
15 Apr 2021
Syntactic Perturbations Reveal Representational Correlates of Hierarchical Phrase Structure in Pretrained Language Models
Matteo Alleman
J. Mamou
Miguel Rio
Hanlin Tang
Yoon Kim
SueYeon Chung
NAI
100
17
0
15 Apr 2021
Hierarchical Learning for Generation with Long Source Sequences
T. Rohde
Xiaoxia Wu
Yinhan Liu
BDL
VLM
76
56
0
15 Apr 2021
Quantifying Gender Bias Towards Politicians in Cross-Lingual Language Models
Karolina Stańczak
Sagnik Ray Choudhury
Tiago Pimentel
Ryan Cotterell
Isabelle Augenstein
84
24
0
15 Apr 2021
Natural Language Understanding with Privacy-Preserving BERT
Chen Qu
Weize Kong
Liu Yang
Mingyang Zhang
Michael Bendersky
Marc Najork
103
76
0
15 Apr 2021
Effect of Post-processing on Contextualized Word Representations
Hassan Sajjad
Firoj Alam
Fahim Dalvi
Nadir Durrani
61
9
0
15 Apr 2021
Pseudo Zero Pronoun Resolution Improves Zero Anaphora Resolution
Ryuto Konno
Shun Kiyono
Yuichiroh Matsubayashi
Hiroki Ouchi
Kentaro Inui
19
10
0
15 Apr 2021
Consistency Training with Virtual Adversarial Discrete Perturbation
Jungsoo Park
Gyuwan Kim
Jaewoo Kang
76
15
0
15 Apr 2021
Lattice-BERT: Leveraging Multi-Granularity Representations in Chinese Pre-trained Language Models
Yuxuan Lai
Yijia Liu
Yansong Feng
Songfang Huang
Dongyan Zhao
VLM
AI4CE
77
38
0
15 Apr 2021
COIL: Revisit Exact Lexical Match in Information Retrieval with Contextualized Inverted List
Luyu Gao
Zhuyun Dai
Jamie Callan
87
220
0
15 Apr 2021
Disentangling Representations of Text by Masking Transformers
Xiongyi Zhang
Jan-Willem van de Meent
Byron C. Wallace
DRL
64
21
0
14 Apr 2021
UDALM: Unsupervised Domain Adaptation through Language Modeling
Constantinos F. Karouzos
Georgios Paraskevopoulos
Alexandros Potamianos
72
57
0
14 Apr 2021
K-PLUG: Knowledge-injected Pre-trained Language Model for Natural Language Understanding and Generation in E-Commerce
Song Xu
Haoran Li
Peng Yuan
Yujia Wang
Youzheng Wu
Xiaodong He
Ying Liu
Bowen Zhou
KELM
91
24
0
14 Apr 2021
Enhancing Interpretable Clauses Semantically using Pretrained Word Representation
Rohan Kumar Yadav
Lei Jiao
Ole-Christoffer Granmo
Morten Goodwin
NAI
67
16
0
14 Apr 2021
I Wish I Would Have Loved This One, But I Didn't -- A Multilingual Dataset for Counterfactual Detection in Product Reviews
James O'Neill
Polina Rozenshtein
Ryuichi Kiryo
Motoko Kubota
Danushka Bollegala
76
31
0
14 Apr 2021
Knowledge-driven Answer Generation for Conversational Search
Mariana Leite
Rafael Ferreira
David Semedo
João Magalhães
RALM
KELM
78
1
0
14 Apr 2021
AR-LSAT: Investigating Analytical Reasoning of Text
Wanjun Zhong
Siyuan Wang
Duyu Tang
Zenan Xu
Daya Guo
Jiahai Wang
Jian Yin
Ming Zhou
Nan Duan
ELM
137
44
0
14 Apr 2021
Demystifying BERT: Implications for Accelerator Design
Suchita Pati
Shaizeen Aga
Nuwan Jayasena
Matthew D. Sinclair
LLMAG
88
17
0
14 Apr 2021
Developing a Conversational Recommendation System for Navigating Limited Options
Victor S. Bursztyn
Jennifer Healey
Eunyee Koh
Nedim Lipka
Larry Birnbaum
25
7
0
13 Apr 2021
BERT Embeddings Can Track Context in Conversational Search
Rafael Ferreira
David Semedo
João Magalhães
AI4TS
55
0
0
13 Apr 2021
The Future is not One-dimensional: Complex Event Schema Induction by Graph Modeling for Event Prediction
Manling Li
Sha Li
Zhenhailong Wang
Lifu Huang
Kyunghyun Cho
Heng Ji
Jiawei Han
Clare R. Voss
114
58
0
13 Apr 2021
Reducing Discontinuous to Continuous Parsing with Pointer Network Reordering
Daniel Fernández-González
Carlos Gómez-Rodríguez
35
11
0
13 Apr 2021