Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1909.11942
Cited By
v1
v2
v3
v4
v5
v6 (latest)
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
26 September 2019
Zhenzhong Lan
Mingda Chen
Sebastian Goodman
Kevin Gimpel
Piyush Sharma
Radu Soricut
SSL
AIMat
Re-assign community
ArXiv (abs)
PDF
HTML
Github (3271★)
Papers citing
"ALBERT: A Lite BERT for Self-supervised Learning of Language Representations"
50 / 2,935 papers shown
Title
All is Not Lost: LLM Recovery without Checkpoints
Nikolay Blagoev
Oğuzhan Ersoy
Lydia Yiyu Chen
32
0
0
18 Jun 2025
Enhancing Hyperbole and Metaphor Detection with Their Bidirectional Dynamic Interaction and Emotion Knowledge
Li Zheng
Sihang Wang
Hao Fei
Zuquan Peng
Fei Li
Jianming Fu
Chong Teng
Donghong Ji
15
0
0
18 Jun 2025
FASCIST-O-METER: Classifier for Neo-fascist Discourse Online
Rudy Alexandro Garrido Veliz
Martin Semmann
Chris Biemann
Seid Muhie Yimam
122
0
0
12 Jun 2025
Latent Multi-Head Attention for Small Language Models
Sushant Mehta
Raj Abhijit Dandekar
Rajat Dandekar
Sreedath Panat
RALM
41
0
0
11 Jun 2025
Towards Efficient Multi-LLM Inference: Characterization and Analysis of LLM Routing and Hierarchical Techniques
Adarsh Prasad Behera
J. Champati
Roberto Morabito
Sasu Tarkoma
J. Gross
21
0
0
06 Jun 2025
semantic-features: A User-Friendly Tool for Studying Contextual Word Embeddings in Interpretable Semantic Spaces
Jwalanthi Ranganathan
Rohan Jha
Kanishka Misra
Kyle Mahowald
42
0
0
06 Jun 2025
SoK: Are Watermarks in LLMs Ready for Deployment?
Kieu Dang
Phung Lai
Nhathai Phan
Yelong Shen
Ruoming Jin
Abdallah Khreishah
My T. Thai
27
0
0
05 Jun 2025
Training-free AI for Earth Observation Change Detection using Physics Aware Neuromorphic Networks
Stephen Smith
Cormac Purcell
Zdenka Kuncic
25
0
0
04 Jun 2025
MCFNet: A Multimodal Collaborative Fusion Network for Fine-Grained Semantic Classification
Yang Qiao
Xiaoyu Zhong
Xiaofeng Gu
Zhiguo Yu
70
0
0
29 May 2025
Improving QA Efficiency with DistilBERT: Fine-Tuning and Inference on mobile Intel CPUs
Ngeyen Yinkfu
7
0
0
28 May 2025
VeriTrail: Closed-Domain Hallucination Detection with Traceability
Dasha Metropolitansky
Jonathan Larson
HILM
56
0
0
27 May 2025
Unfolding A Few Structures for The Many: Memory-Efficient Compression of Conformer and Speech Foundation Models
Zhaoqing Li
Haoning Xu
Xurong Xie
Zengrui Jin
Tianzi Wang
Xunying Liu
30
0
0
27 May 2025
ResSVD: Residual Compensated SVD for Large Language Model Compression
Haolei Bai
Siyong Jian
Tuo Liang
Yu Yin
Huan Wang
46
0
0
26 May 2025
Recurrent Self-Attention Dynamics: An Energy-Agnostic Perspective from Jacobians
Akiyoshi Tomihari
Ryo Karakida
46
0
0
26 May 2025
Discrete Markov Bridge
Hengli Li
Yuxuan Wang
Song-Chun Zhu
Ying Nian Wu
Zilong Zheng
DiffM
69
0
0
26 May 2025
Reasoning Beyond Language: A Comprehensive Survey on Latent Chain-of-Thought Reasoning
Xinghao Chen
Anhao Zhao
Heming Xia
Xuan Lu
Hanlin Wang
Yanjun Chen
Wei Zhang
Jian Wang
W. Li
Xiaoyu Shen
ReLM
LRM
81
0
0
22 May 2025
FS-DAG: Few Shot Domain Adapting Graph Networks for Visually Rich Document Understanding
Amit Agarwal
Srikant Panda
Kulbhushan Pachauri
57
4
0
22 May 2025
Leveraging Large Language Models for Command Injection Vulnerability Analysis in Python: An Empirical Study on Popular Open-Source Projects
Yuxuan Wang
Jingshu Chen
Qingyang Wang
ELM
53
0
0
21 May 2025
SDLog: A Deep Learning Framework for Detecting Sensitive Information in Software Logs
Roozbeh Aghili
Xingfang Wu
Foutse Khomh
Heng Li
111
0
0
20 May 2025
Large Language Models and Their Applications in Roadway Safety and Mobility Enhancement: A Comprehensive Review
Muhammad Monjurul Karim
Yan Shi
Shucheng Zhang
Bingzhang Wang
Mehrdad Nasri
Yinhai Wang
23
0
0
19 May 2025
Self-Supervised Learning for Image Segmentation: A Comprehensive Survey
Thangarajah Akilan
Nusrat Jahan
Wandong Zhang
SSL
104
0
0
19 May 2025
Class Distillation with Mahalanobis Contrast: An Efficient Training Paradigm for Pragmatic Language Understanding Tasks
Chenlu Wang
Weimin Lyu
Ritwik Banerjee
68
0
0
17 May 2025
On Membership Inference Attacks in Knowledge Distillation
Ziyao Cui
Minxing Zhang
Jian Pei
73
0
0
17 May 2025
Parallel Scaling Law for Language Models
Mouxiang Chen
Binyuan Hui
Zeyu Cui
Jiaxi Yang
Dayiheng Liu
Jianling Sun
Junyang Lin
Zhongxin Liu
MoE
LRM
91
2
0
15 May 2025
AI Greenferencing: Routing AI Inferencing to Green Modular Data Centers with Heron
Tella Rajashekhar Reddy
Palak
Rohan Gandhi
Anjaly Parayil
Chaojie Zhang
...
Liangcheng Yu
Jayashree Mohan
Srinivasan Iyengar
Shivkumar Kalyanaraman
Debopam Bhattacherjee
91
0
0
15 May 2025
Structural-Temporal Coupling Anomaly Detection with Dynamic Graph Transformer
Chang Zong
Yueting Zhuang
Jian Shao
Weiming Lu
87
0
0
13 May 2025
KDH-MLTC: Knowledge Distillation for Healthcare Multi-Label Text Classification
Hajar Sakai
Sarah Lam
VLM
113
0
0
12 May 2025
A Survey on Collaborative Mechanisms Between Large and Small Language Models
Yi Chen
JiaHao Zhao
HaoHao Han
98
1
0
12 May 2025
Learn to Think: Bootstrapping LLM Reasoning Capability Through Graph Representation Learning
Hang Gao
Chenhao Zhang
Tie Wang
Junsuo Zhao
Fengge Wu
Changwen Zheng
Huaping Liu
LRM
194
0
0
09 May 2025
Prediction-powered estimators for finite population statistics in highly imbalanced textual data: Public hate crime estimation
Hannes Waldetoft
Jakob Torgander
Måns Magnusson
59
1
0
05 May 2025
Parameter-Efficient Transformer Embeddings
Henry Ndubuaku
Mouad Talhi
99
0
0
04 May 2025
FineScope : Precision Pruning for Domain-Specialized Large Language Models Using SAE-Guided Self-Data Cultivation
Chaitali Bhattacharyya
Yeseong Kim
111
0
0
01 May 2025
MatMMFuse: Multi-Modal Fusion model for Material Property Prediction
Abhiroop Bhattacharya
Sylvain G. Cloutier
AI4CE
55
0
0
30 Apr 2025
HMI: Hierarchical Knowledge Management for Efficient Multi-Tenant Inference in Pretrained Language Models
Junxuan Zhang
Jiadong Wang
Haoyang Li
Lidan Shou
Ke Chen
Gang Chen
Qin Xie
Guiming Xie
Xuejian Gong
44
0
0
24 Apr 2025
MOOSComp: Improving Lightweight Long-Context Compressor via Mitigating Over-Smoothing and Incorporating Outlier Scores
Fengwei Zhou
Jiafei Song
Wenjin Jason Li
Gengjian Xue
Zhikang Zhao
Yichao Lu
Bailin Na
63
1
0
23 Apr 2025
HYPEROFA: Expanding LLM Vocabulary to New Languages via Hypernetwork-Based Embedding Initialization
Enes Özeren
Yihong Liu
Hinrich Schütze
67
0
0
21 Apr 2025
Quantitative Clustering in Mean-Field Transformer Models
Shi Chen
Zhengjiang Lin
Yury Polyanskiy
Philippe Rigollet
129
2
0
20 Apr 2025
Q-FAKER: Query-free Hard Black-box Attack via Controlled Generation
CheolWon Na
YunSeok Choi
Jee-Hyong Lee
AAML
71
0
0
18 Apr 2025
WildFireCan-MMD: A Multimodal Dataset for Classification of User-Generated Content During Wildfires in Canada
Braeden Sherritt
Isar Nejadgholi
Marzieh Amini
VLM
150
0
0
17 Apr 2025
Out of Sight Out of Mind, Out of Sight Out of Mind: Measuring Bias in Language Models Against Overlooked Marginalized Groups in Regional Contexts
Fatma Elsafoury
David Hartmann
73
0
0
17 Apr 2025
A new training approach for text classification in Mental Health: LatentGLoss
Korhan Sevinç
AI4MH
35
0
0
09 Apr 2025
Exploring Gradient-Guided Masked Language Model to Detect Textual Adversarial Attacks
Xiaomei Zhang
Zhaoxi Zhang
Yanjun Zhang
Xufei Zheng
L. Zhang
Shengshan Hu
Shirui Pan
AAML
58
0
0
08 Apr 2025
Pyramid-based Mamba Multi-class Unsupervised Anomaly Detection
Nasar Iqbal
Niki Martinel
Mamba
76
1
0
04 Apr 2025
Detecting Stereotypes and Anti-stereotypes the Correct Way Using Social Psychological Underpinnings
Kaustubh Shivshankar Shejole
Pushpak Bhattacharyya
47
0
0
04 Apr 2025
Multimodal Fusion and Vision-Language Models: A Survey for Robot Vision
Xiaofeng Han
Shunpeng Chen
Zenghuang Fu
Zhe Feng
Lue Fan
...
Li Guo
Weiliang Meng
Xiaopeng Zhang
Rongtao Xu
Shibiao Xu
122
4
0
03 Apr 2025
Advancing Semantic Caching for LLMs with Domain-Specific Embeddings and Synthetic Data
Waris Gill
Justin Cechmanek
Tyler Hutcherson
Srijith Rajamohan
Jen Agarwal
Muhammad Ali Gulzar
Manvinder Singh
Benoit Dion
65
1
0
03 Apr 2025
From Text to Graph: Leveraging Graph Neural Networks for Enhanced Explainability in NLP
Fabio Yáñez-Romero
Andrés Montoyo
Armando Suárez
Yoan Gutiérrez
Ruslan Mitkov
100
0
0
02 Apr 2025
KernelDNA: Dynamic Kernel Sharing via Decoupled Naive Adapters
Haiduo Huang
Yadong Zhang
Pengju Ren
117
0
0
30 Mar 2025
Evaluating Text-to-Image Synthesis with a Conditional Fréchet Distance
Jaywon Koo
J. Hernandez
Moayed Haji-Ali
Ziyan Yang
Vicente Ordonez
EGVM
117
0
0
27 Mar 2025
Cyborg Data: Merging Human with AI Generated Training Data
Kai North
Christopher Ormerod
68
0
0
26 Mar 2025
1
2
3
4
...
57
58
59
Next