ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1906.08237
  4. Cited By
XLNet: Generalized Autoregressive Pretraining for Language Understanding
v1v2 (latest)

XLNet: Generalized Autoregressive Pretraining for Language Understanding

19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
    AI4CE
ArXiv (abs)PDFHTML

Papers citing "XLNet: Generalized Autoregressive Pretraining for Language Understanding"

50 / 3,518 papers shown
Title
A thorough benchmark of automatic text classification: From traditional approaches to large language models
A thorough benchmark of automatic text classification: From traditional approaches to large language models
Washington Cunha
Leonardo Rocha
M. A. Gonçalves
VLM
88
1
0
02 Apr 2025
Is Less Really More? Fake News Detection with Limited Information
Is Less Really More? Fake News Detection with Limited Information
Zhaoyang Cao
John Nguyen
Reza Zafarani
84
0
0
02 Apr 2025
COST: Contrastive One-Stage Transformer for Vision-Language Small Object Tracking
COST: Contrastive One-Stage Transformer for Vision-Language Small Object Tracking
Chunhui Zhang
Li Liu
Jialin Gao
Xin Sun
Hao Wen
Xi Zhou
Shiming Ge
Yucheng Wang
110
1
0
02 Apr 2025
Enhancing Negation Awareness in Universal Text Embeddings: A Data-efficient and Computational-efficient Approach
Enhancing Negation Awareness in Universal Text Embeddings: A Data-efficient and Computational-efficient Approach
Hongliu Cao
111
1
0
01 Apr 2025
Improving User Behavior Prediction: Leveraging Annotator Metadata in Supervised Machine Learning Models
Improving User Behavior Prediction: Leveraging Annotator Metadata in Supervised Machine Learning Models
Lynnette Ng
Kokil Jaidka
Kaiyuan Tay
Hansin Ahuja
Niyati Chhaya
131
1
0
26 Mar 2025
AutoRad-Lung: A Radiomic-Guided Prompting Autoregressive Vision-Language Model for Lung Nodule Malignancy Prediction
AutoRad-Lung: A Radiomic-Guided Prompting Autoregressive Vision-Language Model for Lung Nodule Malignancy Prediction
Sadaf Khademi
Mehran Shabanpour
Reza Taleei
A. Oikonomou
Arash Mohammadi
MedIm
103
0
0
26 Mar 2025
A Retrieval-Based Approach to Medical Procedure Matching in Romanian
A Retrieval-Based Approach to Medical Procedure Matching in Romanian
Andrei Niculae
Adrian Cosma
Emilian Radoi
116
0
0
26 Mar 2025
Deceptive Humor: A Synthetic Multilingual Benchmark Dataset for Bridging Fabricated Claims with Humorous Content
Deceptive Humor: A Synthetic Multilingual Benchmark Dataset for Bridging Fabricated Claims with Humorous Content
Sai Kartheek Reddy Kasu
Shankar Biradar
Sunil Saumya
103
0
0
20 Mar 2025
Accurate Scene Text Recognition with Efficient Model Scaling and Cloze Self-Distillation
Accurate Scene Text Recognition with Efficient Model Scaling and Cloze Self-Distillation
Andrea Maracani
Savas Ozkan
Sijun Cho
Hyowon Kim
Eunchung Noh
Jeongwon Min
Cho Jung Min
Dookun Park
Mete Ozay
147
0
0
20 Mar 2025
Unified Enhancement of the Generalization and Robustness of Language Models via Bi-Stage Optimization
Unified Enhancement of the Generalization and Robustness of Language Models via Bi-Stage Optimization
Yizhou Sun
Juan Yin
Juan Zhao
Fan Zhang
Yongheng Liu
Hongji Chen
62
0
0
19 Mar 2025
Spotting Persuasion: A Low-cost Model for Persuasion Detection in Political Ads on Social Media
Spotting Persuasion: A Low-cost Model for Persuasion Detection in Political Ads on Social Media
Elyas Meguellati
Stefano Civelli
Pietro Bernardelle
S. Sadiq
Gianluca Demartini
70
0
0
18 Mar 2025
Can Large Vision Language Models Read Maps Like a Human?
Can Large Vision Language Models Read Maps Like a Human?
Shuo Xing
Zezhou Sun
Shuangyu Xie
Kaiyuan Chen
Yanjia Huang
Yuping Wang
Jiachen Li
Dezhen Song
Zhengzhong Tu
142
8
0
18 Mar 2025
A Survey on Federated Fine-tuning of Large Language Models
A Survey on Federated Fine-tuning of Large Language Models
Yebo Wu
Chunlin Tian
Jingguang Li
He Sun
Kahou Tam
Zhanting Zhou
Haicheng Liao
Zhijiang Guo
Li Li
Chengzhong Xu
FedML
154
5
0
15 Mar 2025
How Well Does Your Tabular Generator Learn the Structure of Tabular Data?
Xiangjian Jiang
Nikola Simidjievski
M. Jamnik
LMTD
130
0
0
13 Mar 2025
KV-Distill: Nearly Lossless Learnable Context Compression for LLMs
Vivek Chari
Guanghui Qin
Benjamin Van Durme
VLM
105
2
0
13 Mar 2025
Sentiment Analysis in SemEval: A Review of Sentiment Identification Approaches
Bousselham EL HADDAOUI
R. Chiheb
R. Faizi
A. E. Afia
119
0
0
13 Mar 2025
Towards Graph Foundation Models: A Transferability Perspective
Yansen Wang
Wenqi Fan
Suhang Wang
Yao Ma
88
1
0
13 Mar 2025
Language Models, Graph Searching, and Supervision Adulteration: When More Supervision is Less and How to Make More More
Language Models, Graph Searching, and Supervision Adulteration: When More Supervision is Less and How to Make More More
Arvid Frydenlund
LRM
175
0
0
13 Mar 2025
ARLED: Leveraging LED-based ARMAN Model for Abstractive Summarization of Persian Long Documents
Samira Zangooei
Amirhossein Darmani
Hossein Farahmand Nezhad
Laya Mahmoudi
86
0
0
13 Mar 2025
Autoregressive Image Generation with Randomized Parallel Decoding
Haopeng Li
Jinyue Yang
Guoqi Li
Huan Wang
100
1
0
13 Mar 2025
LabelCoRank: Revolutionizing Long Tail Multi-Label Classification with Co-Occurrence Reranking
Yan Yan
Junyuan Liu
Bo Zhang
65
0
0
11 Mar 2025
Is My Text in Your AI Model? Gradient-based Membership Inference Test applied to LLMs
Gonzalo Mancera
Daniel DeAlcala
Julian Fierrez
Ruben Tolosana
Aythami Morales
116
1
0
10 Mar 2025
eMoE: Task-aware Memory Efficient Mixture-of-Experts-Based (MoE) Model Inference
Suraiya Tairin
Shohaib Mahmud
Haiying Shen
Anand Iyer
MoE
432
1
0
10 Mar 2025
Learning-Order Autoregressive Models with Application to Molecular Graph Generation
Zhe Wang
Jiaxin Shi
N. Heess
Arthur Gretton
Michalis K. Titsias
98
2
0
07 Mar 2025
UniNet: A Unified Multi-granular Traffic Modeling Framework for Network Security
UniNet: A Unified Multi-granular Traffic Modeling Framework for Network Security
Binghui Wu
D. Divakaran
M. Gurusamy
93
0
0
06 Mar 2025
An Optimization Algorithm for Multimodal Data Alignment
Wei Zhang
Xinyu Wang
Lan Yu
S. Li
66
0
0
05 Mar 2025
Zero-Shot Complex Question-Answering on Long Scientific Documents
Wanting Wang
RALM
82
0
0
04 Mar 2025
Wavelet-Driven Masked Image Modeling: A Path to Efficient Visual Representation
Wenzhao Xiang
Chang Liu
Hongyang Yu
Xilin Chen
77
0
0
02 Mar 2025
Retrieval Backward Attention without Additional Training: Enhance Embeddings of Large Language Models via Repetition
Retrieval Backward Attention without Additional Training: Enhance Embeddings of Large Language Models via Repetition
Yifei Duan
Raphael Shang
Deng Liang
Yongqiang Cai
128
0
0
28 Feb 2025
Revisiting Kernel Attention with Correlated Gaussian Process Representation
Revisiting Kernel Attention with Correlated Gaussian Process Representation
Long Minh Bui
Tho Tran Huu
Duy-Tung Dinh
T. Nguyen
Trong Nghia Hoang
127
2
0
27 Feb 2025
The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training
The Sharpness Disparity Principle in Transformers for Accelerating Language Model Pre-Training
Jinbo Wang
Mingze Wang
Zhanpeng Zhou
Junchi Yan
Weinan E
Lei Wu
152
2
0
26 Feb 2025
Exploring Graph Tasks with Pure LLMs: A Comprehensive Benchmark and Investigation
Exploring Graph Tasks with Pure LLMs: A Comprehensive Benchmark and Investigation
Yansen Wang
Xinnan Dai
Wenqi Fan
Yao Ma
144
2
0
26 Feb 2025
CAMEx: Curvature-aware Merging of Experts
CAMEx: Curvature-aware Merging of Experts
Dung V. Nguyen
Minh H. Nguyen
Luc Q. Nguyen
R. Teo
T. Nguyen
Linh Duy Tran
MoMe
177
4
0
26 Feb 2025
LDGen: Enhancing Text-to-Image Synthesis via Large Language Model-Driven Language Representation
LDGen: Enhancing Text-to-Image Synthesis via Large Language Model-Driven Language Representation
Pengzhi Li
Pengfei Yu
Zide Liu
Wei He
Xuhao Pan
Xudong Rao
Tao Wei
Wei Chen
VLM
155
0
0
25 Feb 2025
Predicting Through Generation: Why Generation Is Better for Prediction
Predicting Through Generation: Why Generation Is Better for Prediction
Md. Kowsher
Nusrat Jahan Prottasha
Prakash Bhat
Chun-Nam Yu
Mojtaba Soltanalian
Ivan Garibay
O. Garibay
Chen Chen
Niloofar Yousefi
AI4TS
244
1
0
25 Feb 2025
How Vital is the Jurisprudential Relevance: Law Article Intervened Legal Case Retrieval and Matching
How Vital is the Jurisprudential Relevance: Law Article Intervened Legal Case Retrieval and Matching
Nuo Xu
Peijie Wang
Zi Liang
Junzhou Zhao
X. Guan
AILaw
115
0
0
25 Feb 2025
Enhancing Text Classification with a Novel Multi-Agent Collaboration Framework Leveraging BERT
Enhancing Text Classification with a Novel Multi-Agent Collaboration Framework Leveraging BERT
Hediyeh Baban
Sai A Pidapar
Aashutosh Nema
Sichen Lu
LLMAG
134
0
0
25 Feb 2025
Streaming Looking Ahead with Token-level Self-reward
Han Zhang
Ruixin Hong
Dong Yu
76
2
0
24 Feb 2025
Detecting Code Vulnerabilities with Heterogeneous GNN Training
Detecting Code Vulnerabilities with Heterogeneous GNN Training
Yu Luo
Weifeng Xu
Dianxiang Xu
111
0
0
24 Feb 2025
Model Privacy: A Unified Framework to Understand Model Stealing Attacks and Defenses
Model Privacy: A Unified Framework to Understand Model Stealing Attacks and Defenses
G. Wang
Yuhong Yang
Jie Ding
60
1
0
24 Feb 2025
CSTRL: Context-Driven Sequential Transfer Learning for Abstractive Radiology Report Summarization
CSTRL: Context-Driven Sequential Transfer Learning for Abstractive Radiology Report Summarization
Mst. Fahmida Sultana Naznin
Adnan Ibney Faruq
Mostafa Rifat Tazwar
Md Jobayer
Md. Mehedi Hasan Shawon
Md Rakibul Hasan
MedIm
66
0
0
21 Feb 2025
IPAD: Inverse Prompt for AI Detection -- A Robust and Explainable LLM-Generated Text Detector
IPAD: Inverse Prompt for AI Detection -- A Robust and Explainable LLM-Generated Text Detector
Zheng Chen
Yushi Feng
Changyang He
Yue Deng
Hongxi Pu
Yue Liu
DeLMO
81
1
0
21 Feb 2025
Comprehensive Analysis of Transparency and Accessibility of ChatGPT, DeepSeek, And other SoTA Large Language Models
Comprehensive Analysis of Transparency and Accessibility of ChatGPT, DeepSeek, And other SoTA Large Language Models
Ranjan Sapkota
Shaina Raza
Manoj Karkee
101
7
0
21 Feb 2025
What Are They Filtering Out? A Survey of Filtering Strategies for Harm Reduction in Pretraining Datasets
Marco Antonio Stranisci
Christian Hardmeier
165
1
0
17 Feb 2025
Graph-based Retrieval Augmented Generation for Dynamic Few-shot Text Classification
Graph-based Retrieval Augmented Generation for Dynamic Few-shot Text Classification
Yubo Wang
Haoyang Li
Fei Teng
Lei Chen
184
1
0
17 Feb 2025
The underlying structures of self-attention: symmetry, directionality, and emergent dynamics in Transformer training
The underlying structures of self-attention: symmetry, directionality, and emergent dynamics in Transformer training
Matteo Saponati
Pascal Sager
Pau Vilimelis Aceituno
Thilo Stadelmann
Benjamin Grewe
35
1
0
15 Feb 2025
Handwritten Text Recognition: A Survey
Handwritten Text Recognition: A Survey
Carlos Garrido-Munoz
Antonio Ríos-Vila
Jorge Calvo-Zaragoza
137
0
0
12 Feb 2025
Context information can be more important than reasoning for time series forecasting with a large language model
Context information can be more important than reasoning for time series forecasting with a large language model
Janghoon Yang
AI4TSLRM
108
1
0
08 Feb 2025
Lexical Substitution is not Synonym Substitution: On the Importance of Producing Contextually Relevant Word Substitutes
Lexical Substitution is not Synonym Substitution: On the Importance of Producing Contextually Relevant Word Substitutes
Juraj Vladika
Stephen Meisenbacher
Florian Matthes
260
0
0
06 Feb 2025
A Framework for Double-Blind Federated Adaptation of Foundation Models
A Framework for Double-Blind Federated Adaptation of Foundation Models
Nurbek Tastan
Karthik Nandakumar
FedML
77
0
0
03 Feb 2025
Previous
12345...697071
Next