1909.11942
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL
AIMat
Github (3271★)
Papers citing
"ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"
50 / 2,935 papers shown
Title
BERT Lost Patience Won't Be Robust to Adversarial Slowdown
Zachary Coalson
Gabriel Ritter
Rakesh Bobba
Sanghyun Hong
AAML
47
2
0
29 Oct 2023
Stacking the Odds: Transformer-Based Ensemble for AI-Generated Text Detection
Duke Nguyen
Khaing Myat Noe Naing
Aditya Joshi
61
7
0
29 Oct 2023
Multi-grained Evidence Inference for Multi-choice Reading Comprehension
Yilin Zhao
Hai Zhao
Sufeng Duan
62
2
0
27 Oct 2023
Outlier Dimensions Encode Task-Specific Knowledge
William Rudman
Catherine Chen
Carsten Eickhoff
65
5
0
26 Oct 2023
PETA: Evaluating the Impact of Protein Transfer Learning with Sub-word Tokenization on Downstream Applications
Yang Tan
Mingchen Li
P. Tan
Ziyi Zhou
Huiqun Yu
Guisheng Fan
Liang Hong
65
0
0
26 Oct 2023
Understanding the Role of Input Token Characters in Language Models: How Does Information Loss Affect Performance?
Ahmed Alajrami
Katerina Margatina
Nikolaos Aletras
AAML
65
1
0
26 Oct 2023
Joint Entity and Relation Extraction with Span Pruning and Hypergraph Neural Networks
Zhaohui Yan
Songlin Yang
Wei Liu
Kewei Tu
109
13
0
26 Oct 2023
Apollo: Zero-shot MultiModal Reasoning with Multiple Experts
Daniela Ben-David
Tzuf Paz-Argaman
Reut Tsarfaty
MoE
68
0
0
25 Oct 2023
Kiki or Bouba? Sound Symbolism in Vision-and-Language Models
Morris Alper
Hadar Averbuch-Elor
110
10
0
25 Oct 2023
FedTherapist: Mental Health Monitoring with User-Generated Linguistic Expressions on Smartphones via Federated Learning
Jaemin Shin
Hyungjun Yoon
Seungjoo Lee
Sungjoon Park
Yunxin Liu
Jinho D. Choi
Sung-Ju Lee
70
6
0
25 Oct 2023
Subspace Chronicles: How Linguistic Information Emerges, Shifts and Interacts during Language Model Training
Max Müller-Eberstein
Rob van der Goot
Barbara Plank
Ivan Titov
129
10
0
25 Oct 2023
URL-BERT: Training Webpage Representations via Social Media Engagements
A. Qamar
Chetan Verma
Ahmed El-Kishky
Sumit Binnani
Sneha Mehta
Taylor Berg-Kirkpatrick
60
0
0
25 Oct 2023
CR-COPEC: Causal Rationale of Corporate Performance Changes to Learn from Financial Reports
Ye Eun Chun
Sunjae Kwon
Kyung-Woo Sohn
Nakwon Sung
Junyoup Lee
Byungki Seo
Kevin Compher
Seung-won Hwang
Jaesik Choi
76
1
0
24 Oct 2023
Retrieval-based Knowledge Transfer: An Effective Approach for Extreme Large Language Model Compression
Jiduan Liu
Jiahao Liu
Qifan Wang
Jingang Wang
Xunliang Cai
Dongyan Zhao
Ran Wang
Rui Yan
61
4
0
24 Oct 2023
TRAMS: Training-free Memory Selection for Long-range Language Modeling
Haofei Yu
Cunxiang Wang
Yue Zhang
Wei Bi
RALM
100
6
0
24 Oct 2023
CRaSh: Clustering, Removing, and Sharing Enhance Fine-tuning without Full Large Language Model
Kaiyan Zhang
Ning Ding
Biqing Qi
Xuekai Zhu
Xinwei Long
Bowen Zhou
95
5
0
24 Oct 2023
PartialFormer: Modeling Part Instead of Whole for Machine Translation
Tong Zheng
Bei Li
Huiwen Bao
Jiale Wang
Weiqiao Shan
Tong Xiao
Jingbo Zhu
MoE
AI4CE
45
0
0
23 Oct 2023
Unveiling the Multi-Annotation Process: Examining the Influence of Annotation Quantity and Instance Difficulty on Model Performance
Pritam Kadasi
Mayank Singh
57
3
0
23 Oct 2023
PromptCBLUE: A Chinese Prompt Tuning Benchmark for the Medical Domain
Wei-wei Zhu
Xiaoling Wang
Huanran Zheng
Mosha Chen
Buzhou Tang
ELM
LM&MA
69
36
0
22 Oct 2023
Transductive Learning for Textual Few-Shot Classification in API-based Embedding Models
Pierre Colombo
Victor Pellegrain
Malik Boudiaf
Victor Storchan
Myriam Tami
Ismail Ben Ayed
Céline Hudelot
Pablo Piantanida
99
8
0
21 Oct 2023
A Novel Information-Theoretic Objective to Disentangle Representations for Fair Classification
Pierre Colombo
Nathan Noiry
Guillaume Staerman
Pablo Piantanida
FaML
DRL
78
1
0
21 Oct 2023
Plausibility Processing in Transformer Language Models: Focusing on the Role of Attention Heads in GPT
Soo Hyun Ryu
45
0
0
20 Oct 2023
The Less the Merrier? Investigating Language Representation in Multilingual Models
H. Nigatu
A. Tonja
Jugal Kalita
76
1
0
20 Oct 2023
Unsupervised Candidate Answer Extraction through Differentiable Masker-Reconstructor Model
Zhuoer Wang
Yicheng Wang
Ziwei Zhu
James Caverlee
80
0
0
19 Oct 2023
A Predictive Factor Analysis of Social Biases and Task-Performance in Pretrained Masked Language Models
Yi Zhou
Jose Camacho-Collados
Danushka Bollegala
153
6
0
19 Oct 2023
Boosting Inference Efficiency: Unleashing the Power of Parameter-Shared Pre-trained Language Models
Weize Chen
Xiaoyue Xu
Xu Han
Yankai Lin
Ruobing Xie
Zhiyuan Liu
Maosong Sun
Jie Zhou
39
0
0
19 Oct 2023
Character-level Chinese Backpack Language Models
Hao Sun
John Hewitt
59
0
0
19 Oct 2023
Time-Aware Representation Learning for Time-Sensitive Question Answering
Jungbin Son
Alice Oh
70
6
0
19 Oct 2023
Pretraining Language Models with Text-Attributed Heterogeneous Graphs
Tao Zou
Le Yu
Yifei Huang
Leilei Sun
Bo Du
AI4CE
62
17
0
19 Oct 2023
DepWiGNN: A Depth-wise Graph Neural Network for Multi-hop Spatial Reasoning in Text
Shuaiyi Li
Yang Deng
Wai Lam
93
2
0
19 Oct 2023
SPEED: Speculative Pipelined Execution for Efficient Decoding
Coleman Hooper
Sehoon Kim
Hiva Mohammadzadeh
Hasan Genç
Kurt Keutzer
A. Gholami
Y. Shao
77
41
0
18 Oct 2023
DesignQuizzer: A Community-Powered Conversational Agent for Learning Visual Design
Zhenhui Peng
Qiaoyi Chen
Zhiyu Shen
Xiaojuan Ma
Antti Oulasvirta
54
5
0
18 Oct 2023
Improving Long Document Topic Segmentation Models With Enhanced Coherence Modeling
Hai Yu
Chong Deng
Qinglin Zhang
Jiaqing Liu
Qian Chen
Wen Wang
AI4TS
90
10
0
18 Oct 2023
Chain-of-Thought Tuning: Masked Language Models can also Think Step By Step in Natural Language Understanding
Caoyun Fan
Jidong Tian
Yitian Li
Wenqing Chen
Hao He
Yaohui Jin
LRM
71
4
0
18 Oct 2023
Disentangling the Linguistic Competence of Privacy-Preserving BERT
Stefan Arnold
Nils Kemmerzell
Annika Schreiner
77
0
0
17 Oct 2023
QADYNAMICS: Training Dynamics-Driven Synthetic QA Diagnostic for Zero-Shot Commonsense Question Answering
Haochen Shi
Weiqi Wang
Tianqing Fang
Baixuan Xu
Wenxuan Ding
Xin Liu
Yangqiu Song
113
7
0
17 Oct 2023
Survey of Vulnerabilities in Large Language Models Revealed by Adversarial Attacks
Erfan Shayegani
Md Abdullah Al Mamun
Yu Fu
Pedram Zaree
Yue Dong
Nael B. Abu-Ghazaleh
AAML
238
163
0
16 Oct 2023
PELA: Learning Parameter-Efficient Models with Low-Rank Approximation
Yangyang Guo
Guangzhi Wang
Mohan S. Kankanhalli
38
3
0
16 Oct 2023
Diversifying the Mixture-of-Experts Representation for Language Models with Orthogonal Optimizer
Boan Liu
Liang Ding
Li Shen
Keqin Peng
Yu Cao
Dazhao Cheng
Dacheng Tao
MoE
80
9
0
15 Oct 2023
CarExpert: Leveraging Large Language Models for In-Car Conversational Question Answering
Md. Rony
Christian Suess
Sinchana Ramakanth Bhat
Viju Sudhi
Julia Schneider
Maximilian Vogel
Roman Teucher
Ken E. Friedl
S. Sahoo
71
11
0
14 Oct 2023
Low-Resource Clickbait Spoiling for Indonesian via Question Answering
Ni Putu Intan Maharani
Ayu Purwarianti
Alham Fikri Aji
65
2
0
12 Oct 2023
To token or not to token: A Comparative Study of Text Representations for Cross-Lingual Transfer
Md. Mushfiqur Rahman
Fardin Ahsan Sakib
Fahim Faisal
Antonios Anastasopoulos
60
3
0
12 Oct 2023
Pit One Against Many: Leveraging Attention-head Embeddings for Parameter-efficient Multi-head Attention
Huiyin Xue
Nikolaos Aletras
102
0
0
11 Oct 2023
On the Relationship between Sentence Analogy Identification and Sentence Structure Encoding in Large Language Models
Thilini Wijesiriwardene
Ruwan Wickramarachchi
Aishwarya N. Reganti
Vinija Jain
Aman Chadha
Amit P. Sheth
Amitava Das
56
1
0
11 Oct 2023
The Temporal Structure of Language Processing in the Human Brain Corresponds to The Layered Hierarchy of Deep Language Models
Ariel Goldstein
Eric Ham
Mariano Schain
Samuel A. Nastase
Zaid Zada
...
Avinatan Hassidim
O. Devinsky
A. Flinker
Omer Levy
Uri Hasson
AI4CE
68
10
0
11 Oct 2023
Sparse Universal Transformer
Shawn Tan
Yikang Shen
Zhenfang Chen
Aaron Courville
Chuang Gan
MoE
82
15
0
11 Oct 2023
A Comparative Study of Transformer-based Neural Text Representation Techniques on Bug Triaging
Atish Kumar Dipongkor
Kevin Moran
23
8
0
10 Oct 2023
P5: Plug-and-Play Persona Prompting for Personalized Response Selection
Joosung Lee
Min Sik Oh
Donghun Lee
62
3
0
10 Oct 2023
Model Tuning or Prompt Tuning? A Study of Large Language Models for Clinical Concept and Relation Extraction
C.A.I. Peng
Xi Yang
Kaleb E. Smith
Zehao Yu
Aokun Chen
Jiang Bian
Yonghui Wu
VLM
LRM
85
32
0
10 Oct 2023
Evolution of Natural Language Processing Technology: Not Just Language Processing Towards General Purpose AI
Masahiro Yamamoto
54
1
0
10 Oct 2023