XLNet: Generalized Autoregressive Pretraining for Language Understanding

19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
    AI4CE

Papers citing "XLNet: Generalized Autoregressive Pretraining for Language Understanding"

50 / 3,521 papers shown
Title
A Unique Training Strategy to Enhance Language Models Capabilities for Health Mention Detection from Social Media Content
Pervaiz Iqbal Khan
Muhammad Nabeel Asim
Andreas Dengel
Sheraz Ahmed
35
1
0
29 Oct 2023
Stacking the Odds: Transformer-Based Ensemble for AI-Generated Text Detection
Duke Nguyen
Khaing Myat Noe Naing
Aditya Joshi
65
7
0
29 Oct 2023
When Reviewers Lock Horn: Finding Disagreement in Scientific Peer Reviews
Sandeep Kumar
Tirthankar Ghosal
Asif Ekbal
101
1
0
28 Oct 2023
Large Language Models Are Better Adversaries: Exploring Generative Clean-Label Backdoor Attacks Against Text Classifiers
Wencong You
Zayd Hammoudeh
Daniel Lowd
AAML
49
15
0
28 Oct 2023
FP8-LM: Training FP8 Large Language Models
Houwen Peng
Kan Wu
Yixuan Wei
Guoshuai Zhao
Yuxiang Yang
...
Zheng Zhang
Shuguang Liu
Joe Chau
Han Hu
Peng Cheng
MQ
111
45
0
27 Oct 2023
Multi-grained Evidence Inference for Multi-choice Reading Comprehension
Yilin Zhao
Hai Zhao
Sufeng Duan
64
2
0
27 Oct 2023
Sliceformer: Make Multi-head Attention as Simple as Sorting in Discriminative Tasks
Shen Yuan
Hongteng Xu
66
0
0
26 Oct 2023
An Ensemble Method Based on the Combination of Transformers with Convolutional Neural Networks to Detect Artificially Generated Text
Vijini Liyanage
Davide Buscaldi
DeLMO
63
3
0
26 Oct 2023
Understanding the Role of Input Token Characters in Language Models: How Does Information Loss Affect Performance?
Ahmed Alajrami
Katerina Margatina
Nikolaos Aletras
AAML
67
1
0
26 Oct 2023
Apollo: Zero-shot MultiModal Reasoning with Multiple Experts
Daniela Ben-David
Tzuf Paz-Argaman
Reut Tsarfaty
MoE
73
0
0
25 Oct 2023
PROMINET: Prototype-based Multi-View Network for Interpretable Email Response Prediction
Yuqing Wang
Prashanth Vijayaraghavan
Ehsan Degan
63
4
0
25 Oct 2023
General Point Model with Autoencoding and Autoregressive
Zhe Li
Zhangyang Gao
Cheng Tan
Stan Z. Li
Laurence T. Yang
AI4CE 3DPC
52
4
0
25 Oct 2023
URL-BERT: Training Webpage Representations via Social Media Engagements
A. Qamar
Chetan Verma
Ahmed El-Kishky
Sumit Binnani
Sneha Mehta
Taylor Berg-Kirkpatrick
62
0
0
25 Oct 2023
I$^2$MD: 3D Action Representation Learning with Inter- and Intra-modal Mutual Distillation
Yunyao Mao
Jiajun Deng
Wen-gang Zhou
Zhenbo Lu
Wanli Ouyang
Houqiang Li
VLM
85
1
0
24 Oct 2023
Meta learning with language models: Challenges and opportunities in the classification of imbalanced text
Apostol T. Vassilev
Honglan Jin
Munawar Hasan
67
0
0
23 Oct 2023
Harnessing Attention Mechanisms: Efficient Sequence Reduction using Attention-based Autoencoders
Daniel Biermann
Fabrizio Palumbo
Morten Goodwin
Ole-Christoffer Granmo
107
0
0
23 Oct 2023
SuperTweetEval: A Challenging, Unified and Heterogeneous Benchmark for Social Media NLP Research
Dimosthenis Antypas
Asahi Ushio
Francesco Barbieri
Leonardo Neves
Kiamehr Rezaee
Luis Espinosa-Anke
Jiaxin Pei
Jose Camacho-Collados
71
10
0
23 Oct 2023
A Survey on LLM-Generated Text Detection: Necessity, Methods, and Future Directions
Junchao Wu
Shu Yang
Runzhe Zhan
Yulin Yuan
Derek F. Wong
Lidia S. Chao
DeLMO
106
33
0
23 Oct 2023
Unveiling the Multi-Annotation Process: Examining the Influence of Annotation Quantity and Instance Difficulty on Model Performance
Pritam Kadasi
Mayank Singh
59
3
0
23 Oct 2023
Leveraging Knowledge Graphs for Orphan Entity Allocation in Resume Processing
Aagam Bakliwal
Shubham Manish Gandhi
Y. Haribhakta
35
1
0
21 Oct 2023
MeaeQ: Mount Model Extraction Attacks with Efficient Queries
Chengwei Dai
Minxuan Lv
Kun Li
Wei Zhou
AAML
70
5
0
21 Oct 2023
Transductive Learning for Textual Few-Shot Classification in API-based Embedding Models
Pierre Colombo
Victor Pellegrain
Malik Boudiaf
Victor Storchan
Myriam Tami
Ismail Ben Ayed
Céline Hudelot
Pablo Piantanida
101
8
0
21 Oct 2023
Foundation Model's Embedded Representations May Detect Distribution Shift
Max Vargas
Adam Tsou
A. Engel
Tony Chiang
70
1
0
20 Oct 2023
Better to Ask in English: Cross-Lingual Evaluation of Large Language Models for Healthcare Queries
Yiqiao Jin
Mohit Chandra
Gaurav Verma
Yibo Hu
Munmun De Choudhury
Srijan Kumar
LM&MA ELM
159
76
0
19 Oct 2023
A Predictive Factor Analysis of Social Biases and Task-Performance in Pretrained Masked Language Models
Yi Zhou
Jose Camacho-Collados
Danushka Bollegala
161
6
0
19 Oct 2023
Generative Marginalization Models
Sulin Liu
Peter J. Ramadge
Ryan P. Adams
76
1
0
19 Oct 2023
The Locality and Symmetry of Positional Encodings
Lihu Chen
Gaël Varoquaux
Fabian M. Suchanek
71
1
0
19 Oct 2023
Pretraining Language Models with Text-Attributed Heterogeneous Graphs
Tao Zou
Le Yu
Yifei Huang
Leilei Sun
Bo Du
AI4CE
62
17
0
19 Oct 2023
Uncertainty-aware Parameter-Efficient Self-training for Semi-supervised Language Understanding
Jianing Wang
Qiushi Sun
Nuo Chen
Chengyu Wang
Jun Huang
Ming Gao
Xiang Li
UQLM
66
4
0
19 Oct 2023
The Sentiment Problem: A Critical Survey towards Deconstructing Sentiment Analysis
Pranav Narayanan Venkit
Mukund Srinath
Sanjana Gautam
Saranya Venkatraman
Vipul Gupta
R. Passonneau
Shomir Wilson
84
15
0
18 Oct 2023
Nonet at SemEval-2023 Task 6: Methodologies for Legal Evaluation
S. Nigam
Aniket Deroy
Noel Shallum
Ayush Kumar Mishra
Anup Roy
Shubham Kumar Mishra
Arnab Bhattacharya
Saptarshi Ghosh
Kripabandhu Ghosh
AILaw ELM
80
11
0
17 Oct 2023
Survey of Vulnerabilities in Large Language Models Revealed by Adversarial Attacks
Erfan Shayegani
Md Abdullah Al Mamun
Yu Fu
Pedram Zaree
Yue Dong
Nael B. Abu-Ghazaleh
AAML
241
164
0
16 Oct 2023
FiLM: Fill-in Language Models for Any-Order Generation
Tianxiao Shen
Hao-Chun Peng
Ruoqi Shen
Yao Fu
Zaïd Harchaoui
Yejin Choi
95
10
0
15 Oct 2023
DropMix: Better Graph Contrastive Learning with Harder Negative Samples
Yueqi Ma
Minjie Chen
Xiang Li
SSL
52
1
0
15 Oct 2023
CAPro: Webly Supervised Learning with Cross-Modality Aligned Prototypes
Yulei Qin
Xingyu Chen
Yunhang Shen
Chaoyou Fu
Yun Gu
Ke Li
Xing Sun
Rongrong Ji
111
3
0
15 Oct 2023
Domain-Specific Language Model Post-Training for Indonesian Financial NLP
Ni Putu Intan Maharani
Yoga Yustiawan
Fauzy Caesar Rochim
Ayu Purwarianti
34
1
0
15 Oct 2023
Overview of ImageArg-2023: The First Shared Task in Multimodal Argument Mining
Zhexiong Liu
Mohamed Elarby
Yang Zhong
Diane Litman
56
11
0
15 Oct 2023
Expanding the Vocabulary of BERT for Knowledge Base Construction
Dong Yang
Xu Wang
Remzi Celebi
KELM
23
1
0
12 Oct 2023
On the Relationship between Sentence Analogy Identification and Sentence Structure Encoding in Large Language Models
Thilini Wijesiriwardene
Ruwan Wickramarachchi
Aishwarya N. Reganti
Vinija Jain
Aman Chadha
Amit P. Sheth
Amitava Das
79
1
0
11 Oct 2023
Language Models As Semantic Indexers
Bowen Jin
Hansi Zeng
Guoyin Wang
Xiusi Chen
Tianxin Wei
...
Yang Li
Hanqing Lu
Suhang Wang
Jiawei Han
Xianfeng Tang
RALM
88
20
0
11 Oct 2023
Fast-ELECTRA for Efficient Pre-training
Chengyu Dong
Liyuan Liu
Hao Cheng
Jingbo Shang
Jianfeng Gao
Xiaodong Liu
79
2
0
11 Oct 2023
PHALM: Building a Knowledge Graph from Scratch by Prompting Humans and a Language Model
Tatsuya Ide
Eiki Murata
Daisuke Kawahara
T. Yamazaki
Shengzhe Li
K. Shinzato
Toshinori Sato
LRM
102
2
0
11 Oct 2023
The Temporal Structure of Language Processing in the Human Brain Corresponds to The Layered Hierarchy of Deep Language Models
Ariel Goldstein
Eric Ham
Mariano Schain
Samuel A. Nastase
Zaid Zada
...
Avinatan Hassidim
O. Devinsky
A. Flinker
Omer Levy
Uri Hasson
AI4CE
68
10
0
11 Oct 2023
Argumentative Stance Prediction: An Exploratory Study on Multimodality and Few-Shot Learning
Arushi Sharma
Abhibha Gupta
Maneesh Bilalpur
58
6
0
11 Oct 2023
GPT-4 as an Agronomist Assistant? Answering Agriculture Exams Using Large Language Models
B. Silva
Leonardo Nunes
Roberto Estevão
Vijay Aski
Ranveer Chandra
ELM LM&MA
98
13
0
10 Oct 2023
LLM for SoC Security: A Paradigm Shift
Dipayan Saha
Shams Tarek
Katayoon Yahyaei
S. Saha
Jingbo Zhou
M. Tehranipoor
Farimah Farahmandi
175
55
0
09 Oct 2023
HyperAttention: Long-context Attention in Near-Linear Time
Insu Han
Rajesh Jayaram
Amin Karbasi
Vahab Mirrokni
David P. Woodruff
A. Zandieh
118
74
0
09 Oct 2023
LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models
Huiqiang Jiang
Qianhui Wu
Chin-Yew Lin
Yuqing Yang
Lili Qiu
118
119
0
09 Oct 2023
Efficient-VQGAN: Towards High-Resolution Image Generation with Efficient Vision Transformers
Shiyue Cao
Yueqin Yin
Lianghua Huang
Yu Liu
Xin Zhao
Deli Zhao
Kaiqi Huang
ViT
93
19
0
09 Oct 2023
Improving Discriminative Multi-Modal Learning with Large-Scale Pre-Trained Models
Chenzhuang Du
Yue Zhao
Chonghua Liao
Jiacheng You
Jie Fu
Hang Zhao
91
2
0
08 Oct 2023