Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1906.08237
Cited By
XLNet: Generalized Autoregressive Pretraining for Language Understanding
19 June 2019
Zhilin Yang
Zihang Dai
Yiming Yang
J. Carbonell
Ruslan Salakhutdinov
Quoc V. Le
AI4CE
Re-assign community
ArXiv
PDF
HTML
Papers citing
"XLNet: Generalized Autoregressive Pretraining for Language Understanding"
50 / 1,306 papers shown
Title
MSStyleTTS: Multi-Scale Style Modeling with Hierarchical Context Information for Expressive Speech Synthesis
Shunwei Lei
Yixuan Zhou
Liyang Chen
Zhiyong Wu
Xixin Wu
Shiyin Kang
Helen Meng
30
7
0
29 Jul 2023
A Hybrid Machine Learning Model for Classifying Gene Mutations in Cancer using LSTM, BiLSTM, CNN, GRU, and GloVe
Sanad Aburass
O. Dorgham
Jamil Al Shaqsi
17
21
0
24 Jul 2023
Pseudo Outlier Exposure for Out-of-Distribution Detection using Pretrained Transformers
Jaeyoung Kim
Kyuheon Jung
Dongbin Na
Sion Jang
Eunbin Park
Sungchul Choi
OODD
35
6
0
18 Jul 2023
Attention over pre-trained Sentence Embeddings for Long Document Classification
Amine Abdaoui
Sourav Dutta
22
1
0
18 Jul 2023
Is Prompt-Based Finetuning Always Better than Vanilla Finetuning? Insights from Cross-Lingual Language Understanding
Bolei Ma
Ercong Nie
Helmut Schmid
Hinrich Schütze
AAML
VLM
LRM
37
8
0
15 Jul 2023
Unsupervised Calibration through Prior Adaptation for Text Classification using Large Language Models
Lautaro Estienne
Luciana Ferrer
Matías Vera
Pablo Piantanida
VLM
34
1
0
13 Jul 2023
A Side-by-side Comparison of Transformers for English Implicit Discourse Relation Classification
Bruce W. Lee
Bongseok Yang
J. Lee
16
0
0
07 Jul 2023
Deep Attention Q-Network for Personalized Treatment Recommendation
Simin Ma
Junghwan Lee
N. Serban
Shihao Yang
OffRL
38
5
0
04 Jul 2023
A Dual-Stream Recurrence-Attention Network With Global-Local Awareness for Emotion Recognition in Textual Dialog
Jiang Li
Xiaoping Wang
Zhigang Zeng
19
4
0
02 Jul 2023
SpATr: MoCap 3D Human Action Recognition based on Spiral Auto-encoder and Transformer Network
Hamza Bouzid
Lahoucine Ballihi
ViT
3DH
22
2
0
30 Jun 2023
Quantizable Transformers: Removing Outliers by Helping Attention Heads Do Nothing
Yelysei Bondarenko
Markus Nagel
Tijmen Blankevoort
MQ
21
88
0
22 Jun 2023
Lexical Speaker Error Correction: Leveraging Language Models for Speaker Diarization Error Correction
Rohit Paturi
S. Srinivasan
Xiang Li
18
13
0
15 Jun 2023
Relational Temporal Graph Reasoning for Dual-task Dialogue Language Understanding
Bowen Xing
Ivor W. Tsang
43
13
0
15 Jun 2023
Research on Named Entity Recognition in Improved transformer with R-Drop structure
Weidong Ji
Yousheng Zhang
Guohui Zhou
Xu Wang
28
0
0
14 Jun 2023
Learning to Mask and Permute Visual Tokens for Vision Transformer Pre-Training
Lorenzo Baraldi
Roberto Amoroso
Marcella Cornia
Lorenzo Baraldi
Andrea Pilzer
Rita Cucchiara
38
2
0
12 Jun 2023
QUERT: Continual Pre-training of Language Model for Query Understanding in Travel Domain Search
Jian Xie
Yidan Liang
Jingping Liu
Yanghua Xiao
Baohua Wu
Shenghua Ni
VLM
LRM
30
8
0
11 Jun 2023
GKD: A General Knowledge Distillation Framework for Large-scale Pre-trained Language Model
Shicheng Tan
Weng Lam Tam
Yuanchun Wang
Wenwen Gong
Yang Yang
...
Jiahao Liu
Jingang Wang
Shuo Zhao
Peng-Zhen Zhang
Jie Tang
ALM
MoE
33
11
0
11 Jun 2023
Extensive Evaluation of Transformer-based Architectures for Adverse Drug Events Extraction
Simone Scaboro
Beatrice Portelli
Emmanuele Chersoni
Enrico Santus
Giuseppe Serra
24
8
0
08 Jun 2023
Evaluating Emotion Arcs Across Languages: Bridging the Global Divide in Sentiment Analysis
Daniela Teodorescu
Saif M. Mohammad
30
8
0
03 Jun 2023
Supplementary Features of BiLSTM for Enhanced Sequence Labeling
Conglei Xu
Kun Shen
Hongguang Sun
33
5
0
31 May 2023
The Utility of Large Language Models and Generative AI for Education Research
Andrew Katz
Umair Shakir
B. Chambers
AI4CE
27
6
0
29 May 2023
Modeling Adversarial Attack on Pre-trained Language Models as Sequential Decision Making
Xuanjie Fang
Sijie Cheng
Yang Liu
Wen Wang
AAML
34
9
0
27 May 2023
Entailment as Robust Self-Learner
Jiaxin Ge
Hongyin Luo
Yoon Kim
James R. Glass
42
3
0
26 May 2023
Neural Summarization of Electronic Health Records
Koyena Pal
Seyed Ali Bahrainian
Laura Y. Mercurio
Carsten Eickhoff
25
3
0
24 May 2023
Bi-Drop: Enhancing Fine-tuning Generalization via Synchronous sub-net Estimation and Optimization
Shoujie Tong
Heming Xia
Damai Dai
Runxin Xu
Tianyu Liu
Binghuai Lin
Yunbo Cao
Zhifang Sui
20
0
0
24 May 2023
Difference-Masking: Choosing What to Mask in Continued Pretraining
Alex Wilf
Syeda Nahida Akter
Leena Mathur
Paul Pu Liang
Sheryl Mathew
Mengrou Shou
Eric Nyberg
Louis-Philippe Morency
CLL
SSL
32
4
0
23 May 2023
Towards Zero-shot Relation Extraction in Web Mining: A Multimodal Approach with Relative XML Path
Zilong Wang
Jingbo Shang
49
0
0
23 May 2023
Should We Attend More or Less? Modulating Attention for Fairness
A. Zayed
Gonçalo Mordido
Samira Shabanian
Sarath Chandar
37
10
0
22 May 2023
Bidirectional Transformer Reranker for Grammatical Error Correction
Ying Zhang
Hidetaka Kamigaito
Manabu Okumura
16
2
0
22 May 2023
F-PABEE: Flexible-patience-based Early Exiting for Single-label and Multi-label text Classification Tasks
Xiangxiang Gao
Wei-wei Zhu
Jiasheng Gao
Congrui Yin
VLM
26
12
0
21 May 2023
PINA: Leveraging Side Information in eXtreme Multi-label Classification via Predicted Instance Neighborhood Aggregation
Eli Chien
Jiong Zhang
Cho-Jui Hsieh
Jyun-Yu Jiang
Wei-Cheng Chang
O. Milenkovic
Hsiang-Fu Yu
30
9
0
21 May 2023
Patton: Language Model Pretraining on Text-Rich Networks
Bowen Jin
Wentao Zhang
Yu Zhang
Yu Meng
Xinyang Zhang
Qi Zhu
Jiawei Han
VLM
45
44
0
20 May 2023
A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation
Xiaowei Huang
Wenjie Ruan
Wei Huang
Gao Jin
Yizhen Dong
...
Sihao Wu
Peipei Xu
Dengyu Wu
André Freitas
Mustafa A. Mustafa
ALM
45
82
0
19 May 2023
Human Behavioral Benchmarking: Numeric Magnitude Comparison Effects in Large Language Models
Raj Sanjay Shah
Vijay Marupudi
Reba Koenen
Khushi Bhardwaj
Sashank Varma
27
6
0
18 May 2023
NoisywikiHow: A Benchmark for Learning with Real-world Noisy Labels in Natural Language Processing
Tingting Wu
Xiao Ding
Minji Tang
Haotian Zhang
Bing Qin
Ting Liu
NoLa
34
9
0
18 May 2023
Think Outside the Code: Brainstorming Boosts Large Language Models in Code Generation
Xinyu Li
Jiang-Tian Xue
Zheng Xie
Ming Li
LRM
19
26
0
18 May 2023
Statistical Knowledge Assessment for Large Language Models
Qingxiu Dong
Jingjing Xu
Lingpeng Kong
Zhifang Sui
Lei Li
HILM
44
6
0
17 May 2023
CageViT: Convolutional Activation Guided Efficient Vision Transformer
Hao Zheng
Jinbao Wang
Xiantong Zhen
Hao Chen
Jingkuan Song
Feng Zheng
ViT
20
0
0
17 May 2023
Machine-Made Media: Monitoring the Mobilization of Machine-Generated Articles on Misinformation and Mainstream News Websites
Hans W. A. Hanley
Zakir Durumeric
DeLMO
23
29
0
16 May 2023
Tailoring Instructions to Student's Learning Levels Boosts Knowledge Distillation
Yuxin Ren
Zi-Qi Zhong
Xingjian Shi
Yi Zhu
Chun Yuan
Mu Li
24
7
0
16 May 2023
DLUE: Benchmarking Document Language Understanding
Ruoxi Xu
Hongyu Lin
Xinyan Guan
Xianpei Han
Yingfei Sun
Le Sun
ELM
24
0
0
16 May 2023
Assessing Hidden Risks of LLMs: An Empirical Study on Robustness, Consistency, and Credibility
Wen-song Ye
Mingfeng Ou
Tianyi Li
Yipeng Chen
Xuetao Ma
...
Sai Wu
Jie Fu
Gang Chen
Haobo Wang
J. Zhao
46
36
0
15 May 2023
ParaLS: Lexical Substitution via Pretrained Paraphraser
Jipeng Qiang
Kang Liu
Yun Li
Yunhao Yuan
Yi Zhu
KELM
29
11
0
14 May 2023
EAML: Ensemble Self-Attention-based Mutual Learning Network for Document Image Classification
Souhail Bakkali
Zuheng Ming
Mickael Coustaty
Marçal Rusiñol
10
6
0
11 May 2023
ANALOGICAL -- A Novel Benchmark for Long Text Analogy Evaluation in Large Language Models
Thilini Wijesiriwardene
Ruwan Wickramarachchi
Bimal Gajera
Shreeyash Mukul Gowaikar
Chandan Gupta
Aman Chadha
Aishwarya N. Reganti
Amit P. Sheth
Amitava Das
ELM
25
13
0
08 May 2023
The EarlyBIRD Catches the Bug: On Exploiting Early Layers of Encoder Models for More Efficient Code Classification
Anastasiia Grishina
Max Hort
Leon Moonen
22
6
0
08 May 2023
Differentially Private Attention Computation
Yeqi Gao
Zhao Song
Xin Yang
47
19
0
08 May 2023
Vcc: Scaling Transformers to 128K Tokens or More by Prioritizing Important Tokens
Zhanpeng Zeng
Cole Hawkins
Min-Fong Hong
Aston Zhang
Nikolaos Pappas
Vikas Singh
Shuai Zheng
21
6
0
07 May 2023
Pre-training Language Model as a Multi-perspective Course Learner
Beiduo Chen
Shaohan Huang
Zi-qiang Zhang
Wu Guo
Zhen-Hua Ling
Haizhen Huang
Furu Wei
Weiwei Deng
Qi Zhang
34
0
0
06 May 2023
DiscoPrompt: Path Prediction Prompt Tuning for Implicit Discourse Relation Recognition
Chunkit Chan
Xin Liu
Cheng Jiayang
Zihan Li
Yangqiu Song
Ginny Wong
Simon See
30
30
0
06 May 2023
Previous
1
2
3
4
5
...
25
26
27
Next