ResearchTrend.AI
XLNet: Generalized Autoregressive Pretraining for Language Understanding
19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
    AI4CE

Papers citing "XLNet: Generalized Autoregressive Pretraining for Language Understanding"

50 / 3,518 papers shown
Title
$O(n)$ Connections are Expressive Enough: Universal Approximability of Sparse Transformers
Chulhee Yun
Yin-Wen Chang
Srinadh Bhojanapalli
A. S. Rawat
Sashank J. Reddi
Sanjiv Kumar
65
84
0
08 Jun 2020
Picket: Guarding Against Corrupted Data in Tabular Data during Learning and Inference
Zifan Liu
Zhechun Zhou
Theodoros Rekatsinas
50
16
0
08 Jun 2020
Pre-training Polish Transformer-based Language Models at Scale
Slawomir Dadas
Michal Perelkiewicz
Rafal Poswiata
98
39
0
07 Jun 2020
BERT Loses Patience: Fast and Robust Inference with Early Exit
Wangchunshu Zhou
Canwen Xu
Tao Ge
Julian McAuley
Ke Xu
Furu Wei
79
343
0
07 Jun 2020
Detecting Emergent Intersectional Biases: Contextualized Word Embeddings Contain a Distribution of Human-like Biases
W. Guo
Aylin Caliskan
59
245
0
06 Jun 2020
A Cross-Task Analysis of Text Span Representations
Shubham Toshniwal
Freda Shi
Bowen Shi
Lingyu Gao
Karen Livescu
Kevin Gimpel
88
36
0
06 Jun 2020
Auxiliary Signal-Guided Knowledge Encoder-Decoder for Medical Report Generation
Mingjie Li
Fuyu Wang
Xiaojun Chang
Xiaodan Liang
MedIm
86
107
0
06 Jun 2020
An Overview of Neural Network Compression
James O'Neill
AI4CE
160
100
0
05 Jun 2020
DeCLUTR: Deep Contrastive Learning for Unsupervised Textual Representations
John Giorgi
Osvald Nitski
Bo Wang
Gary D. Bader
SSL
157
499
0
05 Jun 2020
DeBERTa: Decoding-enhanced BERT with Disentangled Attention
Pengcheng He
Xiaodong Liu
Jianfeng Gao
Weizhu Chen
AAML
195
2,771
0
05 Jun 2020
UFO-BLO: Unbiased First-Order Bilevel Optimization
Valerii Likhosherstov
Xingyou Song
K. Choromanski
Jared Davis
Adrian Weller
130
7
0
05 Jun 2020
CoCon: A Self-Supervised Approach for Controlled Text Generation
Alvin Chan
Yew-Soon Ong
B. Pung
Aston Zhang
Jie Fu
82
86
0
05 Jun 2020
GMAT: Global Memory Augmentation for Transformers
Ankit Gupta
Jonathan Berant
RALM
81
50
0
05 Jun 2020
Understanding Self-Attention of Self-Supervised Audio Transformers
Shu-Wen Yang
Andy T. Liu
Hung-yi Lee
58
27
0
05 Jun 2020
Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing
Zihang Dai
Guokun Lai
Yiming Yang
Quoc V. Le
118
236
0
05 Jun 2020
Auto-decoding Graphs
Sohil Shah
V. Koltun
GNN
60
4
0
04 Jun 2020
Position Masking for Language Models
Andy Wagner
T. Mitra
Mrinal Iyer
Godfrey Da Costa
Marc Tremblay
22
5
0
02 Jun 2020
Surprisal-Triggered Conditional Computation with Neural Networks
Loren Lugosch
Derek Nowrouzezahrai
B. Meyer
75
6
0
02 Jun 2020
A Pairwise Probe for Understanding BERT Fine-Tuning on Machine Reading Comprehension
Jie Cai
Zhengzhou Zhu
Ping Nie
Qian Liu
AAML
26
7
0
02 Jun 2020
BERT-based Ensembles for Modeling Disclosure and Support in Conversational Social Media Text
Tanvi Dadu
Kartikey Pant
R. Mamidi
32
9
0
01 Jun 2020
An Effective Contextual Language Modeling Framework for Speech Summarization with Augmented Features
Shi-Yan Weng
Tien-Hong Lo
Berlin Chen
57
9
0
01 Jun 2020
Probing Emergent Semantics in Predictive Agents via Question Answering
Abhishek Das
Federico Carnevale
Hamza Merzic
Laura Rimell
R. Schneider
...
Alden Hung
Arun Ahuja
S. Clark
Greg Wayne
Felix Hill
81
18
0
01 Jun 2020
Hyperparameter optimization with REINFORCE and Transformers
C. Krishna
Ashish Gupta
Swarnim Narayan
Himanshu Rai
Diksha Manchanda
54
2
0
01 Jun 2020
Conversational Machine Comprehension: a Literature Review
Somil Gupta
Bhanu Pratap Singh Rawat
Hong Yu
79
22
0
01 Jun 2020
A Survey on Transfer Learning in Natural Language Processing
Zaid Alyafeai
Maged S. Alshaibani
Irfan Ahmad
91
75
0
31 May 2020
CNRL at SemEval-2020 Task 5: Modelling Causal Reasoning in Language with Multi-Head Self-Attention Weights based Counterfactual Detection
Rajaswa Patil
V. Baths
31
4
0
31 May 2020
BPGC at SemEval-2020 Task 11: Propaganda Detection in News Articles with Multi-Granularity Knowledge Sharing and Linguistic Features based Ensemble Learning
Rajaswa Patil
Somesh Singh
Swati Agarwal
23
8
0
31 May 2020
Stance Prediction for Contemporary Issues: Data and Experiments
Marjan Hosseinia
Eduard Constantin Dragut
Arjun Mukherjee
63
30
0
29 May 2020
A Comparative Study of Lexical Substitution Approaches based on Neural Language Models
N. Arefyev
Boris Sheludko
Alexander Podolskiy
Alexander Panchenko
30
10
0
29 May 2020
ValueNet: A Natural Language-to-SQL System that Learns from Database Information
Ursin Brunner
Kurt Stockinger
44
10
0
29 May 2020
Using Large Pretrained Language Models for Answering User Queries from Product Specifications
Kalyani Roy
Smit Shah
Nithish Pai
Jaidam Ramtej
Prajit Prashant Nadkarn
Jyotirmoy Banerjee
Pawan Goyal
Surender Kumar
RALM
39
3
0
29 May 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
1.1K
42,712
0
28 May 2020
Syntactic Structure Distillation Pretraining For Bidirectional Encoders
A. Kuncoro
Lingpeng Kong
Daniel Fried
Dani Yogatama
Laura Rimell
Chris Dyer
Phil Blunsom
93
34
0
27 May 2020
CausaLM: Causal Model Explanation Through Counterfactual Language Models
Amir Feder
Nadav Oved
Uri Shalit
Roi Reichart
CML
LRM
161
162
0
27 May 2020
TRIE: End-to-End Text Reading and Information Extraction for Document Understanding
Peng Zhang
Yunlu Xu
Zhanzhan Cheng
Shiliang Pu
Jing Lu
Liang Qiao
Yi Niu
Leilei Gan
SyDa
95
103
0
27 May 2020
Machine Learning-Based Unbalance Detection of a Rotating Shaft Using Vibration Data
Oliver Mey
Willi Neudeck
André Schneider
Olaf Enge-Rosenblatt
42
30
0
26 May 2020
GECToR -- Grammatical Error Correction: Tag, Not Rewrite
Kostiantyn Omelianchuk
Vitaliy Atrasevych
Artem Chernodub
Oleksandr Skurzhanskyi
99
318
0
26 May 2020
ParsBERT: Transformer-based Model for Persian Language Understanding
Mehrdad Farahani
Mohammad Gharachorloo
Marzieh Farahani
Mohammad Manthouri
91
210
0
26 May 2020
An Audio-enriched BERT-based Framework for Spoken Multiple-choice Question Answering
Chia-Chih Kuo
Shang-Bao Luo
Kuan-Yu Chen
65
17
0
25 May 2020
NILE : Natural Language Inference with Faithful Natural Language Explanations
Sawan Kumar
Partha P. Talukdar
XAI
LRM
122
163
0
25 May 2020
Sentiment Analysis: Automatically Detecting Valence, Emotions, and Other Affectual States from Text
Saif M. Mohammad
72
316
0
25 May 2020
Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers
Anne Lauscher
Olga Majewska
Leonardo F. R. Ribeiro
Iryna Gurevych
Nikolai Rozanov
Goran Glavaš
KELM
82
81
0
24 May 2020
Adversarial NLI for Factual Correctness in Text Summarisation Models
Mario Barrantes
Benedikt Herudek
Richard Wang
52
17
0
24 May 2020
Devising Malware Characterstics using Transformers
Simra Shahid
Tanmay Singh
Yash Sharma
Kapil Sharma
39
2
0
23 May 2020
Med-BERT: pre-trained contextualized embeddings on large-scale structured electronic health records for disease prediction
L. Rasmy
Yang Xiang
Z. Xie
Cui Tao
Degui Zhi
AI4MH
LM&MA
116
703
0
22 May 2020
Pretraining with Contrastive Sentence Objectives Improves Discourse Performance of Language Models
Dan Iter
Kelvin Guu
L. Lansing
Dan Jurafsky
78
78
0
20 May 2020
What Makes for Good Views for Contrastive Learning?
Yonglong Tian
Chen Sun
Ben Poole
Dilip Krishnan
Cordelia Schmid
Phillip Isola
SSL
122
1,342
0
20 May 2020
Leveraging Graph to Improve Abstractive Multi-Document Summarization
Wei Li
Xinyan Xiao
Jiachen Liu
Hua Wu
Haifeng Wang
Junping Du
87
136
0
20 May 2020
BiQGEMM: Matrix Multiplication with Lookup Table For Binary-Coding-based Quantized DNNs
Yongkweon Jeon
Baeseong Park
S. Kwon
Byeongwook Kim
Jeongin Yun
Dongsoo Lee
MQ
63
31
0
20 May 2020
Normalized Attention Without Probability Cage
Oliver Richter
Roger Wattenhofer
91
21
0
19 May 2020