XLNet: Generalized Autoregressive Pretraining for Language Understanding

19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
    AI4CE

Papers citing "XLNet: Generalized Autoregressive Pretraining for Language Understanding"

50 / 3,518 papers shown
Title
Sketch-BERT: Learning Sketch Bidirectional Encoder Representation from Transformers by Self-supervised Learning of Sketch Gestalt
Hangyu Lin
Yanwei Fu
Yu-Gang Jiang
Xiangyang Xue
SSL
85
66
0
19 May 2020
Contextual Embeddings: When Are They Worth It?
Simran Arora
Avner May
Jian Zhang
Christopher Ré
63
61
0
18 May 2020
Are All Languages Created Equal in Multilingual BERT?
Shijie Wu
Mark Dredze
84
325
0
18 May 2020
Audio ALBERT: A Lite BERT for Self-supervised Learning of Audio Representation
Po-Han Chi
Pei-Hung Chung
Tsung-Han Wu
Chun-Cheng Hsieh
Yen-Hao Chen
Shang-Wen Li
Hung-yi Lee
SSL
95
148
0
18 May 2020
Spatio-Temporal Graph Transformer Networks for Pedestrian Trajectory Prediction
Cunjun Yu
Xiao Ma
Jiawei Ren
Haiyu Zhao
Shuai Yi
90
475
0
18 May 2020
T-VSE: Transformer-Based Visual Semantic Embedding
M. Bastan
Arnau Ramisa
Mehmet Tek
ViT
28
7
0
17 May 2020
TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data
Pengcheng Yin
Graham Neubig
Wen-tau Yih
Sebastian Riedel
RALM, LMTD
135
609
0
17 May 2020
CS-NLP team at SemEval-2020 Task 4: Evaluation of State-of-the-art NLP Deep Learning Architectures on Commonsense Reasoning Task
Sirwe Saeedi
Ali (Aliakbar) Panahi
Seyran Saeedi
A. Fong
ReLM, ELM, LRM
69
12
0
17 May 2020
Adversarial Training for Commonsense Inference
L. Pereira
Xiaodong Liu
Fei Cheng
Masayuki Asahara
Ichiro Kobayashi
AAML
52
31
0
17 May 2020
Recurrent Chunking Mechanisms for Long-Text Machine Reading Comprehension
Hongyu Gong
Yelong Shen
Dian Yu
Jianshu Chen
Dong Yu
83
43
0
16 May 2020
CERT: Contrastive Self-supervised Learning for Language Understanding
Hongchao Fang
Sicheng Wang
Meng Zhou
Jiayuan Ding
P. Xie
ELM, SSL
76
345
0
16 May 2020
Finding Experts in Transformer Models
Xavier Suau
Luca Zappella
N. Apostoloff
60
31
0
15 May 2020
Spelling Error Correction with Soft-Masked BERT
Shaohua Zhang
Haoran Huang
Jicong Liu
Hang Li
68
214
0
15 May 2020
Deep Learning for Political Science
Kakia Chatsiou
Slava Jankin
AI4CE
68
13
0
13 May 2020
A Mixture of $h-1$ Heads is Better than $h$ Heads
Hao Peng
Roy Schwartz
Dianqi Li
Noah A. Smith
MoE
74
33
0
13 May 2020
Machine Reading Comprehension: The Role of Contextualized Language Models and Beyond
Zhuosheng Zhang
Hai Zhao
Rui Wang
115
63
0
13 May 2020
Large Scale Multi-Actor Generative Dialog Modeling
Alex Boyd
Raul Puri
Mohammad Shoeybi
M. Patwary
Bryan Catanzaro
71
23
0
13 May 2020
Cross-Modality Relevance for Reasoning on Language and Vision
Chen Zheng
Quan Guo
Parisa Kordjamshidi
LRM
88
36
0
12 May 2020
A Report on the 2020 Sarcasm Detection Shared Task
Debanjan Ghosh
Avijit Vajpayee
Smaranda Muresan
66
61
0
12 May 2020
AttViz: Online exploration of self-attention for transparent neural language modeling
Blaž Škrlj
Nika Erzen
Shane Sheehan
Saturnino Luz
Marko Robnik-Šikonja
Senja Pollak
33
9
0
12 May 2020
SKEP: Sentiment Knowledge Enhanced Pre-training for Sentiment Analysis
Hao Tian
Can Gao
Xinyan Xiao
Hao Liu
Bolei He
Hua Wu
Haifeng Wang
Feng Wu
73
237
0
12 May 2020
MART: Memory-Augmented Recurrent Transformer for Coherent Video Paragraph Captioning
Jie Lei
Liwei Wang
Yelong Shen
Dong Yu
Tamara L. Berg
Joey Tianyi Zhou
72
191
0
11 May 2020
Enabling Language Models to Fill in the Blanks
Chris Donahue
Mina Lee
Percy Liang
58
198
0
11 May 2020
CrisisBERT: a Robust Transformer for Crisis Classification and Contextual Crisis Embedding
Junhua Liu
Trisha Singhal
L. Blessing
Kristin L. Wood
Kwan Hui Lim
49
68
0
11 May 2020
A Deep Learning Approach for Automatic Detection of Fake News
Tanik Saikh
Arkadipta De
Asif Ekbal
P. Bhattacharyya
59
34
0
11 May 2020
How Context Affects Language Models' Factual Predictions
Fabio Petroni
Patrick Lewis
Aleksandra Piktus
Tim Rocktäschel
Yuxiang Wu
Alexander H. Miller
Sebastian Riedel
KELM
82
239
0
10 May 2020
Transformer Based Language Models for Similar Text Retrieval and Ranking
Javed Qadrud-Din
Ashraf Bah Rabiou
Ryan S Walker
Ravindra Soni
M. Gajek
Gabriel Pack
A. Rangaraj
41
5
0
10 May 2020
schuBERT: Optimizing Elements of BERT
A. Khetan
Zohar Karnin
86
30
0
09 May 2020
Cyberbullying Detection with Fairness Constraints
O. Gencoglu
93
49
0
09 May 2020
Detecting East Asian Prejudice on Social Media
Bertie Vidgen
Austin Botelho
David A. Broniatowski
E. Guest
Matthew Hall
Helen Z. Margetts
Rebekah Tromble
Zeerak Talat
Scott A. Hale
49
101
0
08 May 2020
Distilling Knowledge from Pre-trained Language Models via Text Smoothing
Xing Wu
Yebin Liu
Xiangyang Zhou
Dianhai Yu
47
6
0
08 May 2020
A Systematic Assessment of Syntactic Generalization in Neural Language Models
Jennifer Hu
Jon Gauthier
Peng Qian
Ethan Gotlieb Wilcox
R. Levy
ELM
110
221
0
07 May 2020
COBRA: Contrastive Bi-Modal Representation Algorithm
Vishaal Udandarao
A. Maiti
Deepak Srivatsav
Suryatej Reddy Vyalla
Yifang Yin
R. Shah
81
23
0
07 May 2020
JASS: Japanese-specific Sequence to Sequence Pre-training for Neural Machine Translation
Zhuoyuan Mao
Fabien Cromierès
Raj Dabre
Haiyue Song
Sadao Kurohashi
73
4
0
07 May 2020
Quda: Natural Language Queries for Visual Data Analytics
Siwei Fu
Kai Xiong
Xiaodong Ge
Siliang Tang
Wei Chen
Yingcai Wu
120
27
0
07 May 2020
The Cascade Transformer: an Application for Efficient Answer Sentence Selection
Luca Soldaini
Alessandro Moschitti
90
44
0
05 May 2020
Spatio-Temporal Event Segmentation and Localization for Wildlife Extended Videos
R. Mounir
R. Gula
J. Theuerkauf
Sudeep Sarkar
24
0
0
05 May 2020
Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks
Zhishuai Guo
Mingrui Liu
Zhuoning Yuan
Li Shen
Wei Liu
Tianbao Yang
93
42
0
05 May 2020
Multi-Stage Conversational Passage Retrieval: An Approach to Fusing Term Importance Estimation and Neural Query Rewriting
Sheng-Chieh Lin
Jheng-Hong Yang
Rodrigo Nogueira
Ming-Feng Tsai
Chuan-Ju Wang
Jimmy J. Lin
84
24
0
05 May 2020
ImpactCite: An XLNet-based method for Citation Impact Analysis
Dominique Mercier
Syed Tahseen Raza Rizvi
Vikas Rajashekar
Andreas Dengel
Sheraz Ahmed
52
16
0
05 May 2020
Exploring Controllable Text Generation Techniques
Shrimai Prabhumoye
A. Black
Ruslan Salakhutdinov
AI4CE
215
91
0
04 May 2020
CAiRE-COVID: A Question Answering and Query-focused Multi-Document Summarization System for COVID-19 Scholarly Information Management
Jane Polak Scowcroft
Yan Xu
Tiezheng Yu
Farhad Bin Siddique
Elham J. Barezi
Pascale Fung
RALM
48
31
0
04 May 2020
To Test Machine Comprehension, Start by Defining Comprehension
Jesse Dunietz
Greg Burnham
Akash Bharadwaj
Owen Rambow
Jennifer Chu-Carroll
D. Ferrucci
FaML
118
65
0
04 May 2020
The Sensitivity of Language Models and Humans to Winograd Schema Perturbations
Mostafa Abdou
Vinit Ravishankar
Maria Barrett
Yonatan Belinkov
Desmond Elliott
Anders Søgaard
ReLM, LRM
113
35
0
04 May 2020
From SPMRL to NMRL: What Did We Learn (and Unlearn) in a Decade of Parsing Morphologically-Rich Languages (MRLs)?
Reut Tsarfaty
Dan Bareket
Stav Klein
Amit Seker
79
40
0
04 May 2020
Unsupervised Alignment-based Iterative Evidence Retrieval for Multi-hop Question Answering
Vikas Yadav
Steven Bethard
Mihai Surdeanu
RALM
119
51
0
04 May 2020
Similarity Analysis of Contextual Word Representation Models
John M. Wu
Yonatan Belinkov
Hassan Sajjad
Nadir Durrani
Fahim Dalvi
James R. Glass
115
75
0
03 May 2020
How Can We Accelerate Progress Towards Human-like Linguistic Generalization?
Tal Linzen
289
195
0
03 May 2020
A Simple Language Model for Task-Oriented Dialogue
Ehsan Hosseini-Asl
Bryan McCann
Chien-Sheng Wu
Semih Yavuz
R. Socher
143
530
0
02 May 2020
Is Multihop QA in DiRe Condition? Measuring and Reducing Disconnected Reasoning
H. Trivedi
Niranjan Balasubramanian
Tushar Khot
Ashish Sabharwal
AAML
13
1
0
02 May 2020