1910.01108
Cited By
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
2 October 2019
Victor Sanh
Lysandre Debut
Julien Chaumond
Thomas Wolf
Papers citing
"DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter"
50 / 131 papers shown
Title
Co-AttenDWG: Co-Attentive Dimension-Wise Gating and Expert Fusion for Multi-Modal Offensive Content Detection
Md. Mithun Hossain
Md. Shakil Hossain
Sudipto Chaki
M. F. Mridha
78
0
0
25 May 2025
Discretization-free Multicalibration through Loss Minimization over Tree Ensembles
Hongyi Henry Jin
Zijun Ding
Dung Daniel Ngo
Zhiwei Steven Wu
49
0
0
23 May 2025
Locality-Sensitive Hashing for Efficient Hard Negative Sampling in Contrastive Learning
Fabian Deuser
Philipp Hausenblas
Hannah Schieber
Daniel Roth
Martin Werner
Norbert Oswald
68
0
0
23 May 2025
On Multilingual Encoder Language Model Compression for Low-Resource Languages
Daniil Gurgurov
Michal Gregor
Josef van Genabith
Simon Ostermann
72
0
0
22 May 2025
MoL for LLMs: Dual-Loss Optimization to Enhance Domain Expertise While Preserving General Capabilities
Jingxue Chen
Qingkun Tang
Qianchun Lu
Siyuan Fang
43
0
0
17 May 2025
ADALog: Adaptive Unsupervised Anomaly detection in Logs with Self-attention Masked Language Model
Przemek Pospieszny
Wojciech Mormul
Karolina Szyndler
Sanjeev Kumar
56
0
0
15 May 2025
KDH-MLTC: Knowledge Distillation for Healthcare Multi-Label Text Classification
Hajar Sakai
Sarah Lam
VLM
63
0
0
12 May 2025
Semantic Retention and Extreme Compression in LLMs: Can We Have Both?
Stanislas Laborde
Martin Cousseau
Antoun Yaacoub
Lionel Prevost
MQ
46
0
0
12 May 2025
AI-Enabled Accurate Non-Invasive Assessment of Pulmonary Hypertension Progression via Multi-Modal Echocardiography
Jiewen Yang
Taoran Huang
Shangwei Ding
Xiaowei Xu
Qinhua Zhao
...
Bin Pu
Jiexuan Zheng
Caojin Zhang
Hongwen Fei
Xuelong Li
41
0
0
12 May 2025
Replay-Based Continual Learning with Dual-Layered Distillation and a Streamlined U-Net for Efficient Text-to-Image Generation
Md. Naimur Asif Borno
Md Sakib Hossain Shovon
Asmaa Soliman Al-Moisheer
Mohammad Ali Moni
57
0
0
11 May 2025
Cape: Context-Aware Prompt Perturbation Mechanism with Differential Privacy
Haoqi Wu
Wei Dai
Li Wang
Qiang Yan
SILM
64
1
0
09 May 2025
Revisiting the MIMIC-IV Benchmark: Experiments Using Language Models for Electronic Health Records
Jesus Lovon
Thouria Ben-Haddi
Jules Di Scala
José G. Moreno
L. Tamine
89
2
0
29 Apr 2025
Ask2Loc: Learning to Locate Instructional Visual Answers by Asking Questions
Chang Zong
Bin Li
Shoujun Zhou
Jian Wan
Lei Zhang
332
0
0
22 Apr 2025
Empirical Evaluation of Knowledge Distillation from Transformers to Subquadratic Language Models
Patrick Haller
Jonas Golde
Alan Akbik
53
0
0
19 Apr 2025
From Large to Super-Tiny: End-to-End Optimization for Cost-Efficient LLMs
Jiliang Ni
Jiachen Pu
Zhongyi Yang
Kun Zhou
Hui Wang
Xiaoliang Xiao
Dakui Wang
Xin Li
Jingfeng Luo
Conggang Hu
54
0
0
18 Apr 2025
Prompt Optimization with Logged Bandit Data
Haruka Kiyohara
Daniel Yiming Cao
Yuta Saito
Thorsten Joachims
119
0
0
03 Apr 2025
Catch Me if You Search: When Contextual Web Search Results Affect the Detection of Hallucinations
Mahjabin Nahar
Eun-Ju Lee
Jin Won Park
Dongwon Lee
HILM
96
0
0
01 Apr 2025
Improving User Behavior Prediction: Leveraging Annotator Metadata in Supervised Machine Learning Models
Lynnette Ng
Kokil Jaidka
Kaiyuan Tay
Hansin Ahuja
Niyati Chhaya
87
0
0
26 Mar 2025
A Generalist Hanabi Agent
Arjun Vaithilingam Sudhakar
Hadi Nekoei
Mathieu Reymond
Miao Liu
Janarthanan Rajendran
Sarath Chandar
358
0
0
17 Mar 2025
GraphEval: A Lightweight Graph-Based LLM Framework for Idea Evaluation
Tao Feng
Yihang Sun
Jiaxuan You
98
1
0
16 Mar 2025
TikZero: Zero-Shot Text-Guided Graphics Program Synthesis
Jonas Belouadi
Eddy Ilg
Margret Keuper
Hideki Tanaka
Masao Utiyama
Raj Dabre
Steffen Eger
Simone Paolo Ponzetto
81
0
0
14 Mar 2025
Training Plug-n-Play Knowledge Modules with Deep Context Distillation
Lucas Caccia
Alan Ansell
Edoardo Ponti
Ivan Vulić
Alessandro Sordoni
SyDa
391
0
0
11 Mar 2025
Is My Text in Your AI Model? Gradient-based Membership Inference Test applied to LLMs
Gonzalo Mancera
Daniel DeAlcala
Julian Fierrez
Ruben Tolosana
Aythami Morales
56
1
0
10 Mar 2025
Development and Enhancement of Text-to-Image Diffusion Models
Rajdeep Roshan Sahu
VLM
87
36
0
07 Mar 2025
The Society of HiveMind: Multi-Agent Optimization of Foundation Model Swarms to Unlock the Potential of Collective Intelligence
Noah Mamie
Susie Xi Rao
LLMAG
AI4CE
83
0
0
07 Mar 2025
Encryption-Friendly LLM Architecture
Donghwan Rho
Taeseong Kim
Minje Park
Jung Woo Kim
Hyunsik Chae
Jung Hee Cheon
Ernest K. Ryu
103
2
0
24 Feb 2025
Fully automatic extraction of morphological traits from the Web: utopia or reality?
Diego Marcos
Robert van de Vlasakker
Ioannis Athanasiadis
P. Bonnet
Hervé Goëau
Alexis Joly
W. Daniel Kissling
César Leblanc
André S. J. van Proosdij
Konstantinos P. Panousis
95
3
0
24 Feb 2025
A Survey of Model Architectures in Information Retrieval
Zhichao Xu
Fengran Mo
Zhiqi Huang
Crystina Zhang
Puxuan Yu
Bei Wang
Jimmy J. Lin
Vivek Srikumar
KELM
3DV
94
2
0
21 Feb 2025
SPEX: Scaling Feature Interaction Explanations for LLMs
Justin Singh Kang
Landon Butler
Abhineet Agarwal
Yigit Efe Erginbas
Ramtin Pedarsani
Kannan Ramchandran
Bin Yu
VLM
LRM
102
2
0
20 Feb 2025
Prompt-based Depth Pruning of Large Language Models
Juyun Wee
Minjae Park
Jaeho Lee
VLM
118
0
0
17 Feb 2025
RideKE: Leveraging Low-Resource, User-Generated Twitter Content for Sentiment and Emotion Detection in Kenyan Code-Switched Dataset
Naome A. Etori
Maria Gini
118
2
0
10 Feb 2025
SMAB: MAB based word Sensitivity Estimation Framework and its Applications in Adversarial Text Generation
Saurabh Kumar Pandey
S. Vashistha
Debrup Das
Somak Aditya
Monojit Choudhury
AAML
95
0
0
10 Feb 2025
Few-shot LLM Synthetic Data with Distribution Matching
Jiyuan Ren
Zhaocheng Du
Zhihao Wen
Qinglin Jia
Sunhao Dai
Chuhan Wu
Zhenhua Dong
SyDa
112
0
0
09 Feb 2025
DexVLA: Vision-Language Model with Plug-In Diffusion Expert for General Robot Control
Junjie Wen
Yinlin Zhu
Jinming Li
Zhibin Tang
Yaxin Peng
Feifei Feng
VLM
72
18
0
09 Feb 2025
Towards Cross-Tokenizer Distillation: the Universal Logit Distillation Loss for LLMs
Nicolas Boizard
Kevin El Haddad
Céline Hudelot
Pierre Colombo
100
15
0
28 Jan 2025
Merino: Entropy-driven Design for Generative Language Models on IoT Devices
Youpeng Zhao
Ming Lin
Huadong Tang
Qiang Wu
Jun Wang
93
0
0
28 Jan 2025
The Effect of Similarity Measures on Accurate Stability Estimates for Local Surrogate Models in Text-based Explainable AI
Christopher Burger
Charles Walter
Thai Le
AAML
157
1
0
20 Jan 2025
AIMA at SemEval-2024 Task 3: Simple Yet Powerful Emotion Cause Pair Analysis
Alireza Ghahramani Kure
Mahshid Dehghani
Mohammad Mahdi Abootorabi
Nona Ghazizadeh
Seyed Arshan Dalili
Ehsaneddin Asgari
68
1
0
19 Jan 2025
A Comprehensive Survey of Foundation Models in Medicine
Wasif Khan
Seowung Leem
Kyle B. See
Joshua K. Wong
Shaoting Zhang
R. Fang
AI4CE
LM&MA
VLM
158
21
0
17 Jan 2025
WhiSPA: Semantically and Psychologically Aligned Whisper with Self-Supervised Contrastive and Student-Teacher Learning
Rajath Rao
Adithya Ganesan
Oscar Kjell
Jonah Luby
Akshay Raghavan
...
B. Luft
Camilo Ruggero
Neville Ryant
R. Kotov
H. Andrew Schwartz
62
0
0
15 Jan 2025
Attention Mechanisms Don't Learn Additive Models: Rethinking Feature Importance for Transformers
Tobias Leemann
Alina Fastowski
Felix Pfeiffer
Gjergji Kasneci
92
5
0
10 Jan 2025
LLM Reasoning Engine: Specialized Training for Enhanced Mathematical Reasoning
Shuguang Chen
Guang Lin
LRM
346
0
0
28 Dec 2024
Unifying Feature-Based Explanations with Functional ANOVA and Cooperative Game Theory
Fabian Fumagalli
Maximilian Muschalik
Eyke Hüllermeier
Barbara Hammer
J. Herbinger
FAtt
92
2
0
22 Dec 2024
FedRLHF: A Convergence-Guaranteed Federated Framework for Privacy-Preserving and Personalized RLHF
Flint Xiaofeng Fan
Cheston Tan
Yew-Soon Ong
Roger Wattenhofer
Wei Tsang Ooi
96
1
0
20 Dec 2024
Perception of Visual Content: Differences Between Humans and Foundation Models
Nardiena A. Pratama
Shaoyang Fan
Gianluca Demartini
VLM
114
0
0
28 Nov 2024
Multi-Label Bayesian Active Learning with Inter-Label Relationships
Yuanyuan Qi
Jueqing Lu
Xiaohao Yang
Joanne Enticott
Lan Du
116
0
0
26 Nov 2024
LAGUNA: LAnguage Guided UNsupervised Adaptation with structured spaces
Anxhelo Diko
Antonino Furnari
Luigi Cinque
G. Farinella
236
0
0
23 Nov 2024
KinMo: Kinematic-aware Human Motion Understanding and Generation
Pengfei Zhang
Pinxin Liu
Hyeongwoo Kim
Pablo Garrido
Bindita Chaudhuri
118
2
0
23 Nov 2024
FuseGPT: Learnable Layers Fusion of Generative Pre-trained Transformers
Zehua Pei
Hui-Ling Zhen
Xianzhi Yu
Sinno Jialin Pan
Mingxuan Yuan
Bei Yu
AI4CE
134
3
0
21 Nov 2024
CorrSynth -- A Correlated Sampling Method for Diverse Dataset Generation from LLMs
Suhas S Kowshik
Abhishek Divekar
Vijit Malik
SyDa
80
0
0
13 Nov 2024