XLNet: Generalized Autoregressive Pretraining for Language Understanding

19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
    AI4CE

Papers citing "XLNet: Generalized Autoregressive Pretraining for Language Understanding"

50 / 1,487 papers shown
WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing
Sanyuan Chen
Chengyi Wang
Zhengyang Chen
Yu-Huan Wu
Shujie Liu
...
Yao Qian
Jian Wu
Michael Zeng
Xiangzhan Yu
Furu Wei
SSL
144
1,721
0
26 Oct 2021
s2s-ft: Fine-Tuning Pretrained Transformer Encoders for Sequence-to-Sequence Learning
Hangbo Bao
Li Dong
Wenhui Wang
Nan Yang
Furu Wei
21
11
0
26 Oct 2021
Improved Goal Oriented Dialogue via Utterance Generation and Look Ahead
Hong Huang
Boaz Carmeli
Ateret Anaby-Tavor
37
2
0
24 Oct 2021
Overview of the 2021 Key Point Analysis Shared Task
Roni Friedman
Lena Dankin
Yufang Hou
R. Aharonov
Yoav Katz
Noam Slonim
21
22
0
20 Oct 2021
Interpreting Deep Learning Models in Natural Language Processing: A Review
Xiaofei Sun
Diyi Yang
Xiaoya Li
Tianwei Zhang
Yuxian Meng
Han Qiu
Guoyin Wang
Eduard H. Hovy
Jiwei Li
24
45
0
20 Oct 2021
Discontinuous Grammar as a Foreign Language
Daniel Fernández-González
Carlos Gómez-Rodríguez
50
9
0
20 Oct 2021
SLAM: A Unified Encoder for Speech and Language Modeling via Speech-Text Joint Pre-Training
Ankur Bapna
Yu-An Chung
Na Wu
Anmol Gulati
Ye Jia
J. Clark
Melvin Johnson
Jason Riesa
Alexis Conneau
Yu Zhang
VLM
64
94
0
20 Oct 2021
LMSOC: An Approach for Socially Sensitive Pretraining
Vivek Kulkarni
Shubhanshu Mishra
A. Haghighi
22
13
0
20 Oct 2021
Improved Multilingual Language Model Pretraining for Social Media Text via Translation Pair Prediction
Shubhanshu Mishra
A. Haghighi
VLM
31
4
0
20 Oct 2021
GNN-LM: Language Modeling based on Global Contexts via GNN
Yuxian Meng
Shi Zong
Xiaoya Li
Xiaofei Sun
Tianwei Zhang
Fei Wu
Jiwei Li
LRM
29
37
0
17 Oct 2021
Seeking Patterns, Not just Memorizing Procedures: Contrastive Learning for Solving Math Word Problems
Zhongli Li
Wenxuan Zhang
Chao Yan
Qingyu Zhou
Chao Li
Hongzhi Liu
Yunbo Cao
AIMat
41
55
0
16 Oct 2021
A Short Study on Compressing Decoder-Based Language Models
Tianda Li
Yassir El Mesbahi
I. Kobyzev
Ahmad Rashid
A. Mahmud
Nithin Anchuri
Habib Hajimolahoseini
Yang Liu
Mehdi Rezagholizadeh
95
25
0
16 Oct 2021
Prix-LM: Pretraining for Multilingual Knowledge Base Construction
Wenxuan Zhou
Fangyu Liu
Ivan Vulić
Nigel Collier
Muhao Chen
KELM
72
18
0
16 Oct 2021
Detecting Gender Bias in Transformer-based Models: A Case Study on BERT
Bingbing Li
Hongwu Peng
Rajat Sainju
Junhuan Yang
Lei Yang
Yueying Liang
Weiwen Jiang
Binghui Wang
Hang Liu
Caiwen Ding
32
12
0
15 Oct 2021
Kronecker Decomposition for GPT Compression
Ali Edalati
Marzieh S. Tahaei
Ahmad Rashid
V. Nia
J. Clark
Mehdi Rezagholizadeh
36
33
0
15 Oct 2021
SPoT: Better Frozen Model Adaptation through Soft Prompt Transfer
Tu Vu
Brian Lester
Noah Constant
Rami Al-Rfou
Daniel Cer
VLM
LRM
137
278
0
15 Oct 2021
P-Tuning v2: Prompt Tuning Can Be Comparable to Fine-tuning Universally Across Scales and Tasks
Xiao Liu
Kaixuan Ji
Yicheng Fu
Weng Lam Tam
Zhengxiao Du
Zhilin Yang
Jie Tang
VLM
238
816
0
14 Oct 2021
Training Neural Networks for Solving 1-D Optimal Piecewise Linear Approximation
Hangcheng Dong
Jing-Xiao Liao
Yan Wang
Yixin Chen
Bingguo Liu
Dong Ye
Guodong Liu
155
0
0
14 Oct 2021
Plug-Tagger: A Pluggable Sequence Labeling Framework Using Language Models
Xin Zhou
Ruotian Ma
Tao Gui
Y. Tan
Qi Zhang
Xuanjing Huang
VLM
18
5
0
14 Oct 2021
Building Chinese Biomedical Language Models via Multi-Level Text Discrimination
Quan Wang
Songtai Dai
Benfeng Xu
Yajuan Lyu
Yong Zhu
Hua Wu
Haifeng Wang
71
14
0
14 Oct 2021
bert2BERT: Towards Reusable Pretrained Language Models
Cheng Chen
Yichun Yin
Lifeng Shang
Xin Jiang
Yujia Qin
Fengyu Wang
Zhi Wang
Xiao Chen
Zhiyuan Liu
Qun Liu
VLM
34
59
0
14 Oct 2021
Towards Efficient NLP: A Standard Evaluation and A Strong Baseline
Xiangyang Liu
Tianxiang Sun
Junliang He
Jiawen Wu
Lingling Wu
Xinyu Zhang
Hao Jiang
Bo Zhao
Xuanjing Huang
Xipeng Qiu
ELM
28
46
0
13 Oct 2021
Automated Essay Scoring Using Transformer Models
Sabrina Ludwig
Christian W. F. Mayer
Christopher Hansen
Kerstin Eilers
Steffen Brandt
21
39
0
13 Oct 2021
Semantic Role Labeling as Dependency Parsing: Exploring Latent Tree Structures Inside Arguments
Yu Zhang
Qingrong Xia
Shilin Zhou
Yong-jia Jiang
Guohong Fu
Min Zhang
48
27
0
13 Oct 2021
Mengzi: Towards Lightweight yet Ingenious Pre-trained Models for Chinese
ZhuoSheng Zhang
Hanqing Zhang
Keming Chen
Yuhang Guo
Jingyun Hua
Yulong Wang
Ming Zhou
VLM
55
71
0
13 Oct 2021
MDERank: A Masked Document Embedding Rank Approach for Unsupervised Keyphrase Extraction
Linhan Zhang
Qian Chen
Wen Wang
Chong Deng
Shiliang Zhang
Bing Li
Wei Wang
Xin Cao
45
56
0
13 Oct 2021
Learning Compact Metrics for MT
Amy Pu
Hyung Won Chung
Ankur P. Parikh
Sebastian Gehrmann
Thibault Sellam
38
99
0
12 Oct 2021
Relative Molecule Self-Attention Transformer
Lukasz Maziarka
Dawid Majchrowski
Tomasz Danel
Piotr Gaiński
Jacek Tabor
Igor T. Podolak
Pawel M. Morkisz
Stanislaw Jastrzebski
MedIm
45
34
0
12 Oct 2021
SignBERT: Pre-Training of Hand-Model-Aware Representation for Sign Language Recognition
Hezhen Hu
Weichao Zhao
Wen-gang Zhou
Yuechen Wang
Houqiang Li
ViT
35
63
0
11 Oct 2021
Supervision Exists Everywhere: A Data Efficient Contrastive Language-Image Pre-training Paradigm
Yangguang Li
Feng Liang
Lichen Zhao
Yufeng Cui
Wanli Ouyang
Jing Shao
F. Yu
Junjie Yan
VLM
CLIP
50
448
0
11 Oct 2021
Advances in Multi-turn Dialogue Comprehension: A Survey
ZhuoSheng Zhang
Hai Zhao
31
21
0
11 Oct 2021
Speeding up Deep Model Training by Sharing Weights and Then Unsharing
Shuo Yang
Le Hou
Xiaodan Song
Qiang Liu
Denny Zhou
113
9
0
08 Oct 2021
Using Keypoint Matching and Interactive Self Attention Network to verify Retail POSMs
Harshita Seth
Sonaal Kant
Muktabh Mayank Srivastava
34
1
0
07 Oct 2021
Noisy Text Data: Achilles' Heel of popular transformer based NLP models
Kartikay Bagla
Ankit Kumar
Shivam Gupta
Anuj Gupta
29
5
0
07 Oct 2021
Capturing Structural Locality in Non-parametric Language Models
Frank F. Xu
Junxian He
Graham Neubig
Vincent J. Hellendoorn
27
14
0
06 Oct 2021
KNN-BERT: Fine-Tuning Pre-Trained Models with KNN Classifier
Linyang Li
Demin Song
Ruotian Ma
Xipeng Qiu
Xuanjing Huang
31
21
0
06 Oct 2021
Using Psuedolabels for training Sentiment Classifiers makes the model generalize better across datasets
N. Reddy
Muktabh Mayank Srivastava
24
0
0
05 Oct 2021
Is Attention always needed? A Case Study on Language Identification from Speech
A. Mandal
Santanu Pal
Indranil Dutta
Mahidas Bhattacharya
S. Naskar
27
6
0
05 Oct 2021
Autoregressive Diffusion Models
Emiel Hoogeboom
Alexey A. Gritsenko
Jasmijn Bastings
Ben Poole
Rianne van den Berg
Tim Salimans
DiffM
47
146
0
05 Oct 2021
A Survey On Neural Word Embeddings
Erhan Sezerer
Selma Tekir
AI4TS
28
12
0
05 Oct 2021
Classification of hierarchical text using geometric deep learning: the case of clinical trials corpus
Sohrab Ferdowsi
Nikolay Borissov
J. Knafou
P. Amini
Douglas Teodoro
16
7
0
04 Oct 2021
Fast Multi-Resolution Transformer Fine-tuning for Extreme Multi-label Text Classification
Jiong Zhang
Wei-Cheng Chang
Hsiang-Fu Yu
Inderjit S. Dhillon
29
99
0
01 Oct 2021
Low Frequency Names Exhibit Bias and Overfitting in Contextualizing Language Models
Robert Wolfe
Aylin Caliskan
95
51
0
01 Oct 2021
Fine-tuning wav2vec2 for speaker recognition
Nik Vaessen
David A. van Leeuwen
49
107
0
30 Sep 2021
First to Possess His Statistics: Data-Free Model Extraction Attack on Tabular Data
Masataka Tasumi
Kazuki Iwahana
Naoto Yanai
Katsunari Shishido
Toshiya Shimizu
Yuji Higuchi
I. Morikawa
Jun Yajima
AAML
30
4
0
30 Sep 2021
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding
Hu Xu
Gargi Ghosh
Po-Yao (Bernie) Huang
Dmytro Okhonko
Armen Aghajanyan
Florian Metze
Luke Zettlemoyer
Christoph Feichtenhofer
CLIP
VLM
259
561
0
28 Sep 2021
What to Prioritize? Natural Language Processing for the Development of a Modern Bug Tracking Solution in Hardware Development
T. Do
Markus Dobler
Niklas Kühl
25
0
0
28 Sep 2021
TURINGBENCH: A Benchmark Environment for Turing Test in the Age of Neural Text Generation
Adaku Uchendu
Zeyu Ma
Thai Le
Rui Zhang
Dongwon Lee
DeLMO
31
124
0
27 Sep 2021
VQA-MHUG: A Gaze Dataset to Study Multimodal Neural Attention in Visual Question Answering
Ekta Sood
Fabian Kögel
Florian Strohm
Prajit Dhar
Andreas Bulling
42
19
0
27 Sep 2021
Context-guided Triple Matching for Multiple Choice Question Answering
Xun Yao
Junlong Ma
Xinrong Hu
Junping Liu
Jie Yang
Wanqing Li
24
2
0
27 Sep 2021