1909.11942
Cited By
v6 (latest)
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL
AIMat
Github (3271★)
Papers citing
"ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"
50 / 2,935 papers shown
Title
Unsupervised Acquisition of Discrete Grammatical Categories
David Ph. Shakouri
Crit Cremers
Niels O. Schiller
65
0
0
24 Mar 2025
CoMP: Continual Multimodal Pre-training for Vision Foundation Models
Yuxiao Chen
L. Meng
Wujian Peng
Zuxuan Wu
Yu-Gang Jiang
VLM
211
1
0
24 Mar 2025
Detection of Somali-written Fake News and Toxic Messages on the Social Media Using Transformer-based Language Models
Muhidin A. Mohamed
Shuab D. Ahmed
Yahye A. Isse
Hanad M. Mohamed
Fuad Mire Hassan
Houssein A. Assowe
84
0
0
23 Mar 2025
Deceptive Humor: A Synthetic Multilingual Benchmark Dataset for Bridging Fabricated Claims with Humorous Content
Sai Kartheek Reddy Kasu
Shankar Biradar
Sunil Saumya
100
0
0
20 Mar 2025
Model Hubs and Beyond: Analyzing Model Popularity, Performance, and Documentation
Pritam Kadasi
Sriman Reddy
Srivathsa Vamsi Chaturvedula
Rudranshu Sen
Agnish Saha
Soumavo Sikdar
Sayani Sarkar
Suhani Mittal
Rohit Jindal
Mayank Singh
128
0
0
19 Mar 2025
Unified Enhancement of the Generalization and Robustness of Language Models via Bi-Stage Optimization
Yizhou Sun
Juan Yin
Juan Zhao
Fan Zhang
Yongheng Liu
Hongji Chen
62
0
0
19 Mar 2025
ARLED: Leveraging LED-based ARMAN Model for Abstractive Summarization of Persian Long Documents
Samira Zangooei
Amirhossein Darmani
Hossein Farahmand Nezhad
Laya Mahmoudi
86
0
0
13 Mar 2025
DiT-Air: Revisiting the Efficiency of Diffusion Model Architecture Design in Text to Image Generation
Chen Chen
Rui Qian
Wenze Hu
Tsu-Jui Fu
Jialing Tong
...
Lezhi Li
Bowen Zhang
Alex Schwing
Wei Liu
Yue Yang
143
0
0
13 Mar 2025
Sentiment Analysis in SemEval: A Review of Sentiment Identification Approaches
Bousselham EL HADDAOUI
R. Chiheb
R. Faizi
A. E. Afia
119
0
0
13 Mar 2025
ReSi: A Comprehensive Benchmark for Representational Similarity Measures
Max Klabunde
Tassilo Wald
Tobias Schumacher
Klaus H. Maier-Hein
Markus Strohmaier
Adriana Iamnitchi
AI4TS
VLM
236
6
0
13 Mar 2025
Talk2PC: Enhancing 3D Visual Grounding through LiDAR and Radar Point Clouds Fusion for Autonomous Driving
Runwei Guan
Tao Huang
Ningwei Ouyang
Daizong Liu
Xiaolou Sun
Lianqing Zheng
Ming Xu
Yutao Yue
Hui Xiong
130
1
0
11 Mar 2025
A Survey on Knowledge-Oriented Retrieval-Augmented Generation
Mingyue Cheng
Yucong Luo
Jie Ouyang
Qiang Liu
Huijie Liu
...
Bohou Zhang
Jiawei Cao
Jie Ma
Daoyu Wang
Enhong Chen
3DV
157
7
0
11 Mar 2025
Large Language Model as Meta-Surrogate for Data-Driven Many-Task Optimization: A Proof-of-Principle Study
Wei Wei
Yue-Jiao Gong
Jun Zhang
101
0
0
11 Mar 2025
CtrlRAG: Black-box Adversarial Attacks Based on Masked Language Models in Retrieval-Augmented Language Generation
Runqi Sui
AAML
86
1
0
10 Mar 2025
Gender Encoding Patterns in Pretrained Language Model Representations
Mahdi Zakizadeh
Mohammad Taher Pilehvar
208
0
0
09 Mar 2025
Fine-Grained Evaluation for Implicit Discourse Relation Recognition
Xinyi Cai
84
0
0
07 Mar 2025
Layer-Specific Scaling of Positional Encodings for Superior Long-Context Modeling
Zhenghua Wang
Yiran Ding
Changze Lv
Zhibo Xu
Changze Lv
Tianyuan Shi
Xiaoqing Zheng
Xuanjing Huang
73
0
0
06 Mar 2025
PriFFT: Privacy-preserving Federated Fine-tuning of Large Language Models via Hybrid Secret Sharing
Zhichao You
Xuewen Dong
Ke Cheng
Xutong Mu
Jiaxuan Fu
Shiyang Ma
Qiang Qu
Yulong Shen
FedML
113
0
0
05 Mar 2025
Zero-Shot Complex Question-Answering on Long Scientific Documents
Wanting Wang
RALM
82
0
0
04 Mar 2025
Efficient or Powerful? Trade-offs Between Machine Learning and Deep Learning for Mental Illness Detection on Social Media
Zhanyi Ding
Zhongyan Wang
Yeyubei Zhang
Yuchen Cao
Yunchong Liu
Xiaorui Shen
Yexin Tian
Jianglai Dai
AI4MH
127
4
0
03 Mar 2025
Neural ODE Transformers: Analyzing Internal Dynamics and Adaptive Fine-tuning
Anh Tong
Thanh Nguyen-Tang
Dongeun Lee
Duc Nguyen
Toan M. Tran
David Hall
Cheongwoong Kang
Jaesik Choi
147
1
0
03 Mar 2025
EPEE: Towards Efficient and Effective Foundation Models in Biomedicine
Zaifu Zhan
Shuang Zhou
Huixue Zhou
Ziqiang Liu
Rui Zhang
85
1
0
03 Mar 2025
TimesBERT: A BERT-Style Foundation Model for Time Series Understanding
Haoran Zhang
Yong Liu
Yunzhong Qiu
Haixuan Liu
Zhongyi Pei
Jianmin Wang
Mingsheng Long
AI4TS
68
1
0
28 Feb 2025
Uncertainty Quantification in Retrieval Augmented Question Answering
Laura Perez-Beltrachini
Mirella Lapata
RALM
157
0
0
25 Feb 2025
Encryption-Friendly LLM Architecture
Donghwan Rho
Taeseong Kim
Minje Park
Jung Woo Kim
Hyunsik Chae
Jung Hee Cheon
Ernest K. Ryu
230
6
0
24 Feb 2025
Reasoning with Latent Thoughts: On the Power of Looped Transformers
Nikunj Saunshi
Nishanth Dikkala
Zhiyuan Li
Sanjiv Kumar
Sashank J. Reddi
OffRL
LRM
AI4CE
141
22
0
24 Feb 2025
Robust Bias Detection in MLMs and its Application to Human Trait Ratings
Ingroj Shrestha
Louis Tay
Padmini Srinivasan
117
0
0
24 Feb 2025
Towards Typologically Aware Rescoring to Mitigate Unfaithfulness in Lower-Resource Languages
Tsan Tsai Chan
Xin Tong
Thi Thu Uyen Hoang
Barbare Tepnadze
Wojciech Stempniak
127
0
0
24 Feb 2025
Pay Attention to Real World Perturbations! Natural Robustness Evaluation in Machine Reading Comprehension
Yulong Wu
Viktor Schlegel
Riza Batista-Navarro
AAML
76
0
0
23 Feb 2025
Iterative Auto-Annotation for Scientific Named Entity Recognition Using BERT-Based Models
Kartik Gupta
62
0
0
22 Feb 2025
Comprehensive Analysis of Transparency and Accessibility of ChatGPT, DeepSeek, And other SoTA Large Language Models
Ranjan Sapkota
Shaina Raza
Manoj Karkee
101
7
0
21 Feb 2025
Hyper-SET: Designing Transformers via Hyperspherical Energy Minimization
Yunzhe Hu
Difan Zou
Dong Xu
155
1
0
17 Feb 2025
The underlying structures of self-attention: symmetry, directionality, and emergent dynamics in Transformer training
Matteo Saponati
Pascal Sager
Pau Vilimelis Aceituno
Thilo Stadelmann
Benjamin Grewe
32
1
0
15 Feb 2025
LLM4GNAS: A Large Language Model Based Toolkit for Graph Neural Architecture Search
Yang Gao
Hong Yang
Y. Chen
Junxian Wu
Peng Zhang
Haishuai Wang
84
1
0
12 Feb 2025
Enhancing Auto-regressive Chain-of-Thought through Loop-Aligned Reasoning
Qifan Yu
Zhenyu He
Sijie Li
Xun Zhou
Jun Zhang
Jingjing Xu
Di He
OffRL
LRM
139
5
0
12 Feb 2025
Al-Khwarizmi: Discovering Physical Laws with Foundation Models
Christopher E. Mower
Haitham Bou-Ammar
AI4CE
205
2
0
03 Feb 2025
SecPE: Secure Prompt Ensembling for Private and Robust Large Language Models
Jiawen Zhang
Kejia Chen
Zunlei Feng
Jian Lou
Mingli Song
Qingbin Liu
Xiaoyu Yang
AAML
SILM
FedML
169
1
0
02 Feb 2025
AdditiveLLM: Large Language Models Predict Defects in Additive Manufacturing
P. Pak
A. Farimani
AI4CE
133
1
0
29 Jan 2025
A Review on Self-Supervised Learning for Time Series Anomaly Detection: Recent Advances and Open Challenges
Aitor Sánchez-Ferrera
Borja Calvo
Jose A. Lozano
AI4TS
130
0
0
28 Jan 2025
Merino: Entropy-driven Design for Generative Language Models on IoT Devices
Youpeng Zhao
Ming Lin
Huadong Tang
Qiang Wu
Jun Wang
144
0
0
28 Jan 2025
A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics
Kai He
Rui Mao
Qika Lin
Yucheng Ruan
Xiang Lan
Mengling Feng
Min Zhang
LM&MA
AILaw
228
176
0
28 Jan 2025
EDoRA: Efficient Weight-Decomposed Low-Rank Adaptation via Singular Value Decomposition
Hamid Nasiri
Peter Garraghan
73
2
0
21 Jan 2025
Reference-free Evaluation Metrics for Text Generation: A Survey
Takumi Ito
Kees van Deemter
Jun Suzuki
ELM
123
2
0
21 Jan 2025
TAD-Bench: A Comprehensive Benchmark for Embedding-Based Text Anomaly Detection
Yang Cao
Sikun Yang
Chen Li
Haolong Xiang
Lianyong Qi
Bo Liu
Rongsheng Li
Ming Liu
128
0
0
21 Jan 2025
A Contrastive Framework with User, Item and Review Alignment for Recommendation
Hoang V. Dong
Yuan Fang
Hady W. Lauw
447
2
0
21 Jan 2025
Assessing and Enhancing the Robustness of Large Language Models with Task Structure Variations for Logical Reasoning
Qiming Bao
Gaël Gendron
A. Peng
Wanjun Zhong
N. Tan
Yang Chen
Michael Witbrock
Qingbin Liu
LRM
ELM
165
5
0
20 Jan 2025
Harnessing the Potential of Large Language Models in Modern Marketing Management: Applications, Future Directions, and Strategic Recommendations
Raha Aghaei
Ali A. Kiaei
Mahnaz Boush
Javad Vahidi
Mohammad Zavvar
Zeynab Barzegar
Mahan Rofoosheh
OffRL
102
1
0
18 Jan 2025
A Comprehensive Survey of Foundation Models in Medicine
Wasif Khan
Seowung Leem
Kyle B. See
Joshua K. Wong
Shaoting Zhang
R. Fang
AI4CE
LM&MA
VLM
282
27
0
17 Jan 2025
Bridging the Fairness Gap: Enhancing Pre-trained Models with LLM-Generated Sentences
Liu Yu
Ludie Guo
Ping Kuang
Fan Zhou
81
1
0
12 Jan 2025
Multi-task Visual Grounding with Coarse-to-Fine Consistency Constraints
Ming Dai
Jian Li
Jiedong Zhuang
Xian Zhang
Wankou Yang
ObjD
88
2
0
12 Jan 2025