ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1910.01108
  4. Cited By
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and
  lighter

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

2 October 2019
Victor Sanh
Lysandre Debut
Julien Chaumond
Thomas Wolf
ArXivPDFHTML

Papers citing "DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter"

50 / 131 papers shown
Title
ViTOC: Vision Transformer and Object-aware Captioner
ViTOC: Vision Transformer and Object-aware Captioner
Feiyang Huang
63
0
0
09 Nov 2024
The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities
The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities
Zhaofeng Wu
Xinyan Velocity Yu
Dani Yogatama
Jiasen Lu
Yoon Kim
AIFin
68
17
0
07 Nov 2024
Navigating Extremes: Dynamic Sparsity in Large Output Spaces
Navigating Extremes: Dynamic Sparsity in Large Output Spaces
Nasib Ullah
Erik Schultheis
Mike Lasby
Yani Andrew Ioannou
Rohit Babbar
52
0
0
05 Nov 2024
Generative AI-Powered Plugin for Robust Federated Learning in Heterogeneous IoT Networks
Generative AI-Powered Plugin for Robust Federated Learning in Heterogeneous IoT Networks
Youngjoon Lee
J. Gong
Joonhyuk Kang
70
0
0
31 Oct 2024
InjecGuard: Benchmarking and Mitigating Over-defense in Prompt Injection Guardrail Models
InjecGuard: Benchmarking and Mitigating Over-defense in Prompt Injection Guardrail Models
Haoyang Li
Xiaogeng Liu
SILM
64
5
0
30 Oct 2024
Vulnerability of LLMs to Vertically Aligned Text Manipulations
Vulnerability of LLMs to Vertically Aligned Text Manipulations
Zhecheng Li
Yijiao Wang
Bryan Hooi
Yujun Cai
Zhen Xiong
Nanyun Peng
Kai-Wei Chang
104
1
0
26 Oct 2024
SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models
SWITCH: Studying with Teacher for Knowledge Distillation of Large Language Models
Jahyun Koo
Yerin Hwang
Yongil Kim
Taegwan Kang
Hyunkyung Bae
Kyomin Jung
86
0
0
25 Oct 2024
Natural Language Processing for the Legal Domain: A Survey of Tasks, Datasets, Models, and Challenges
Natural Language Processing for the Legal Domain: A Survey of Tasks, Datasets, Models, and Challenges
Farid Ariai
Gianluca Demartini
ELM
AILaw
VLM
55
4
0
25 Oct 2024
Is the GPU Half-Empty or Half-Full? Practical Scheduling Techniques for LLMs
Is the GPU Half-Empty or Half-Full? Practical Scheduling Techniques for LLMs
Ferdi Kossmann
Bruce Fontaine
Daya Khudia
Michael Cafarella
Samuel Madden
215
2
0
23 Oct 2024
MiniPLM: Knowledge Distillation for Pre-Training Language Models
MiniPLM: Knowledge Distillation for Pre-Training Language Models
Yuxian Gu
Hao Zhou
Fandong Meng
Jie Zhou
Minlie Huang
122
5
0
22 Oct 2024
A Novel Characterization of the Population Area Under the Risk Coverage Curve (AURC) and Rates of Finite Sample Estimators
A Novel Characterization of the Population Area Under the Risk Coverage Curve (AURC) and Rates of Finite Sample Estimators
Han Zhou
Jordy Van Landeghem
Teodora Popordanoska
Matthew B. Blaschko
65
2
0
20 Oct 2024
Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling
Speculative Knowledge Distillation: Bridging the Teacher-Student Gap Through Interleaved Sampling
Wenyuan Xu
Rujun Han
Zhenting Wang
L. Le
Dhruv Madeka
Lei Li
Wenjie Wang
Rishabh Agarwal
Chen-Yu Lee
Tomas Pfister
100
9
0
15 Oct 2024
Locality Alignment Improves Vision-Language Models
Locality Alignment Improves Vision-Language Models
Ian Covert
Tony Sun
James Zou
Tatsunori Hashimoto
VLM
145
5
0
14 Oct 2024
Towards Synergistic, Generalized, and Efficient Dual-System for Robotic Manipulation
Towards Synergistic, Generalized, and Efficient Dual-System for Robotic Manipulation
Qingwen Bu
Hongyang Li
Li Chen
Jisong Cai
Jia Zeng
Heming Cui
Maoqing Yao
Yu Qiao
78
6
0
10 Oct 2024
Joint Fine-tuning and Conversion of Pretrained Speech and Language Models towards Linear Complexity
Joint Fine-tuning and Conversion of Pretrained Speech and Language Models towards Linear Complexity
Mutian He
Philip N. Garner
124
0
0
09 Oct 2024
GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models
GLOV: Guided Large Language Models as Implicit Optimizers for Vision Language Models
Muhammad Jehanzeb Mirza
Mengjie Zhao
Zhuoyuan Mao
Sivan Doveh
Wei Lin
...
Yuki Mitsufuji
Horst Possegger
Rogerio Feris
Leonid Karlinsky
James Glass
VLM
126
1
0
08 Oct 2024
Efficient Inference for Large Language Model-based Generative Recommendation
Efficient Inference for Large Language Model-based Generative Recommendation
Xinyu Lin
Chaoqun Yang
Wenjie Wang
Yongqi Li
Cunxiao Du
Fuli Feng
See-Kiong Ng
Tat-Seng Chua
93
4
0
07 Oct 2024
Dolphin: A Programmable Framework for Scalable Neurosymbolic Learning
Dolphin: A Programmable Framework for Scalable Neurosymbolic Learning
Aaditya Naik
Jason Liu
Claire Wang
Amish Sethi
Saikat Dutta
Mayur Naik
Eric Wong
58
2
0
04 Oct 2024
Discrete Policy: Learning Disentangled Action Space for Multi-Task Robotic Manipulation
Discrete Policy: Learning Disentangled Action Space for Multi-Task Robotic Manipulation
Kun Wu
Yichen Zhu
Jinming Li
Junjie Wen
Ning Liu
Zhiyuan Xu
Qinru Qiu
98
6
0
27 Sep 2024
Sample Compression Unleashed: New Generalization Bounds for Real Valued Losses
Sample Compression Unleashed: New Generalization Bounds for Real Valued Losses
Mathieu Bazinet
Valentina Zantedeschi
Pascal Germain
MLT
AI4CE
52
2
0
26 Sep 2024
One missing piece in Vision and Language: A Survey on Comics Understanding
One missing piece in Vision and Language: A Survey on Comics Understanding
Emanuele Vivoli
Andrey Barsky
Mohamed Ali Souibgui
Artemis LLabres
Marco Bertini
Dimosthenis Karatzas
68
4
0
14 Sep 2024
What is the Role of Small Models in the LLM Era: A Survey
What is the Role of Small Models in the LLM Era: A Survey
Lihu Chen
Gaël Varoquaux
ALM
140
26
0
10 Sep 2024
On The Role of Prompt Construction In Enhancing Efficacy and Efficiency of LLM-Based Tabular Data Generation
On The Role of Prompt Construction In Enhancing Efficacy and Efficiency of LLM-Based Tabular Data Generation
Banooqa H. Banday
Kowshik Thopalli
Tanzima Z. Islam
Jayaraman J. Thiagarajan
88
0
0
06 Sep 2024
DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture
DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture
Qianlong Xiang
Miao Zhang
Yuzhang Shang
Jianlong Wu
Yan Yan
Liqiang Nie
DiffM
87
10
0
05 Sep 2024
Normalized AOPC: Fixing Misleading Faithfulness Metrics for Feature Attribution Explainability
Normalized AOPC: Fixing Misleading Faithfulness Metrics for Feature Attribution Explainability
Joakim Edin
Andreas Geert Motzfeldt
Casper L. Christensen
Tuukka Ruotsalo
Lars Maaløe
Maria Maistro
92
4
0
15 Aug 2024
The advantages of context specific language models: the case of the Erasmian Language Model
The advantages of context specific language models: the case of the Erasmian Language Model
João Gonçalves
Nick Jelicic
Michele Murgia
Evert Stamhuis
58
0
0
13 Aug 2024
LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection
LLM-DetectAIve: a Tool for Fine-Grained Machine-Generated Text Detection
Mervat Abassy
Kareem Elozeiri
Alexander Aziz
Minh Ngoc Ta
Raj Vardhan Tomar
...
Alham Fikri Aji
Artem Shelmanov
Nizar Habash
Iryna Gurevych
Preslav Nakov
DeLMO
80
17
0
08 Aug 2024
Private Collaborative Edge Inference via Over-the-Air Computation
Private Collaborative Edge Inference via Over-the-Air Computation
Selim F. Yilmaz
Burak Hasircioglu
Li Qiao
Deniz Gunduz
FedML
99
1
0
30 Jul 2024
Overcoming Uncertain Incompleteness for Robust Multimodal Sequential Diagnosis Prediction via Curriculum Data Erasing Guided Knowledge Distillation
Overcoming Uncertain Incompleteness for Robust Multimodal Sequential Diagnosis Prediction via Curriculum Data Erasing Guided Knowledge Distillation
Heejoon Koo
83
0
0
28 Jul 2024
FsPONER: Few-shot Prompt Optimization for Named Entity Recognition in Domain-specific Scenarios
FsPONER: Few-shot Prompt Optimization for Named Entity Recognition in Domain-specific Scenarios
Yongjian Tang
Rakebul Hasan
Thomas Runkler
104
2
0
10 Jul 2024
Direct Preference Knowledge Distillation for Large Language Models
Direct Preference Knowledge Distillation for Large Language Models
Yixing Li
Yuxian Gu
Li Dong
Dequan Wang
Yu Cheng
Furu Wei
67
6
0
28 Jun 2024
Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon
Recite, Reconstruct, Recollect: Memorization in LMs as a Multifaceted Phenomenon
USVSN Sai Prashanth
Alvin Deng
Kyle O'Brien
Jyothir S V
Mohammad Aflah Khan
...
Jacob Ray Fuehne
Stella Biderman
Tracy Ke
Katherine Lee
Naomi Saphra
106
12
0
25 Jun 2024
A Syntax-Injected Approach for Faster and More Accurate Sentiment Analysis
A Syntax-Injected Approach for Faster and More Accurate Sentiment Analysis
Muhammad Imran
Olga Kellert
Carlos Gómez-Rodríguez
31
1
0
21 Jun 2024
Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation
Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation
Jiaming Zhou
Teli Ma
Kun-Yu Lin
Ronghe Qiu
Zifan Wang
Junwei Liang
88
7
0
20 Jun 2024
SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems
SysCaps: Language Interfaces for Simulation Surrogates of Complex Systems
Patrick Emami
Zhaonan Li
Saumya Sinha
Truc Nguyen
101
1
0
30 May 2024
An Empirical Analysis of Forgetting in Pre-trained Models with Incremental Low-Rank Updates
An Empirical Analysis of Forgetting in Pre-trained Models with Incremental Low-Rank Updates
Albin Soutif--Cormerais
Simone Magistri
Joost van de Weijer
Andew D. Bagdanov
57
1
0
28 May 2024
SoK: Leveraging Transformers for Malware Analysis
SoK: Leveraging Transformers for Malware Analysis
Pradip Kunwar
Kshitiz Aryal
Maanak Gupta
Mahmoud Abdelsalam
Elisa Bertino
103
0
0
27 May 2024
Tokenization Matters! Degrading Large Language Models through Challenging Their Tokenization
Tokenization Matters! Degrading Large Language Models through Challenging Their Tokenization
Dixuan Wang
Yanda Li
Junyuan Jiang
Zepeng Ding
Ziqin Luo
Guochao Jiang
Jiaqing Liang
Deqing Yang
58
13
0
27 May 2024
What Do You See? Enhancing Zero-Shot Image Classification with Multimodal Large Language Models
What Do You See? Enhancing Zero-Shot Image Classification with Multimodal Large Language Models
Abdelrahman Abdelhamed
Mahmoud Afifi
Alec Go
MLLM
VLM
75
3
0
24 May 2024
Full Line Code Completion: Bringing AI to Desktop
Full Line Code Completion: Bringing AI to Desktop
Anton Semenkin
Vitaliy Bibaev
Yaroslav Sokolov
Kirill Krylov
Alexey Kalina
...
Mikhail Podvitskii
Petr Surkov
Yaroslav Golubev
Nikita Povarov
T. Bryksin
61
2
0
14 May 2024
Large Language Models for Cyber Security: A Systematic Literature Review
Large Language Models for Cyber Security: A Systematic Literature Review
HanXiang Xu
Shenao Wang
Ningke Li
Kaidi Wang
Yanjie Zhao
Kai Chen
Ting Yu
Yang Liu
Haoyu Wang
76
33
0
08 May 2024
LAPTOP-Diff: Layer Pruning and Normalized Distillation for Compressing Diffusion Models
LAPTOP-Diff: Layer Pruning and Normalized Distillation for Compressing Diffusion Models
Dingkun Zhang
Sijia Li
Chen Chen
Qingsong Xie
H. Lu
56
25
0
17 Apr 2024
Unified Static and Dynamic Network: Efficient Temporal Filtering for Video Grounding
Unified Static and Dynamic Network: Efficient Temporal Filtering for Video Grounding
Jingjing Hu
Dan Guo
Kun Li
Zhan Si
Xun Yang
Xiaojun Chang
Meng Wang
82
3
0
21 Mar 2024
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
DROID: A Large-Scale In-The-Wild Robot Manipulation Dataset
Alexander Khazatsky
Karl Pertsch
Suraj Nair
Ashwin Balakrishna
Sudeep Dasari
...
Thomas Kollar
Sergey Levine
Chelsea Finn
Sergey Levine
Chelsea Finn
121
197
0
19 Mar 2024
Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality
Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality
Rahul Zalkikar
Kanchan Chandra
74
1
0
21 Feb 2024
Chain-of-Instructions: Compositional Instruction Tuning on Large Language Models
Chain-of-Instructions: Compositional Instruction Tuning on Large Language Models
S. Hayati
Taehee Jung
Tristan Bodding-Long
Sudipta Kar
A. Sethy
Joo-Kyung Kim
Dongyeop Kang
ALM
LRM
78
7
0
18 Feb 2024
Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes
Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes
Lucio Dery
Steven Kolawole
Jean-Francois Kagey
Virginia Smith
Graham Neubig
Ameet Talwalkar
54
29
0
08 Feb 2024
Large Language Model Agent for Hyper-Parameter Optimization
Large Language Model Agent for Hyper-Parameter Optimization
Siyi Liu
Chen Gao
Yong Li
78
21
0
02 Feb 2024
Credit Risk Meets Large Language Models: Building a Risk Indicator from Loan Descriptions in P2P Lending
Credit Risk Meets Large Language Models: Building a Risk Indicator from Loan Descriptions in P2P Lending
Mario Sanz-Guerrero
Javier Arroyo
55
5
0
29 Jan 2024
FREE: The Foundational Semantic Recognition for Modeling Environmental Ecosystems
FREE: The Foundational Semantic Recognition for Modeling Environmental Ecosystems
Shiyuan Luo
Juntong Ni
Shengyu Chen
Runlong Yu
Yiqun Xie
Licheng Liu
Zhenong Jin
Huaxiu Yao
Xiaowei Jia
67
8
0
17 Nov 2023
Previous
123
Next