ResearchTrend.AI
FastBERT: a Self-distilling BERT with Adaptive Inference Time


5 April 2020
Weijie Liu
Peng Zhou
Zhe Zhao
Zhiruo Wang
Haotang Deng
Qi Ju
arXiv: 2004.02178

Papers citing "FastBERT: a Self-distilling BERT with Adaptive Inference Time"

50 / 63 papers shown
DYNAMAX: Dynamic computing for Transformers and Mamba based architectures
Miguel Nogales
Matteo Gambella
Manuel Roveri
56
0
0
29 Apr 2025
BEEM: Boosting Performance of Early Exit DNNs using Multi-Exit Classifiers as Experts
Divya J. Bajpai
M. Hanawal
76
0
0
02 Feb 2025
On the Impact of White-box Deployment Strategies for Edge AI on Latency and Model Performance
Jaskirat Singh
Bram Adams
Ahmed E. Hassan
VLM
43
0
0
01 Nov 2024
Accelerating Large Language Model Inference with Self-Supervised Early Exits
Florian Valade
LRM
44
1
0
30 Jul 2024
Tiny Models are the Computational Saver for Large Models
Qingyuan Wang
B. Cardiff
Antoine Frappé
Benoît Larras
Deepu John
44
2
0
26 Mar 2024
On the Impact of Black-box Deployment Strategies for Edge AI on Latency and Model Performance
Jaskirat Singh
Emad Fallahzadeh
Bram Adams
Ahmed E. Hassan
MQ
40
3
0
25 Mar 2024
DE$^3$-BERT: Distance-Enhanced Early Exiting for BERT based on Prototypical Networks
Jianing He
Qi Zhang
Weiping Ding
Duoqian Miao
Jun Zhao
Liang Hu
LongBing Cao
38
3
0
03 Feb 2024
EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism
Yanxi Chen
Xuchen Pan
Yaliang Li
Bolin Ding
Jingren Zhou
LRM
41
31
0
08 Dec 2023
AutoMix: Automatically Mixing Language Models
Pranjal Aggarwal
Aman Madaan
Ankit Anand
Srividya Pranavi Potharaju
Swaroop Mishra
...
Karthik Kappaganthu
Yiming Yang
Shyam Upadhyay
Manaal Faruqui
Mausam
42
17
0
19 Oct 2023
DecoderLens: Layerwise Interpretation of Encoder-Decoder Transformers
Anna Langedijk
Hosein Mohebbi
Gabriele Sarti
Willem H. Zuidema
Jaap Jumelet
32
10
0
05 Oct 2023
An Ensemble Approach to Question Classification: Integrating Electra Transformer, GloVe, and LSTM
Sanad Aburass
O. Dorgham
Maha Abu Rumman
27
3
0
13 Aug 2023
Vesper: A Compact and Effective Pretrained Model for Speech Emotion Recognition
Weidong Chen
Xiaofen Xing
Peihao Chen
Xiangmin Xu
VLM
30
35
0
20 Jul 2023
PuMer: Pruning and Merging Tokens for Efficient Vision Language Models
Qingqing Cao
Bhargavi Paranjape
Hannaneh Hajishirzi
MLLM
VLM
13
21
0
27 May 2023
Zero-TPrune: Zero-Shot Token Pruning through Leveraging of the Attention Graph in Pre-Trained Transformers
Hongjie Wang
Bhishma Dedhia
N. Jha
ViT
VLM
44
26
0
27 May 2023
F-PABEE: Flexible-patience-based Early Exiting for Single-label and Multi-label text Classification Tasks
Xiangxiang Gao
Wei-wei Zhu
Jiasheng Gao
Congrui Yin
VLM
26
12
0
21 May 2023
Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning and Coding with LLMs
Pranjal Aggarwal
Aman Madaan
Yiming Yang
Mausam
LRM
33
38
0
19 May 2023
MoT: Memory-of-Thought Enables ChatGPT to Self-Improve
Xiaonan Li
Xipeng Qiu
ReLM
KELM
LRM
AI4MH
26
32
0
09 May 2023
A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT
Yihan Cao
Siyu Li
Yixin Liu
Zhiling Yan
Yutong Dai
Philip S. Yu
Lichao Sun
35
508
0
07 Mar 2023
AdaEnsemble: Learning Adaptively Sparse Structured Ensemble Network for Click-Through Rate Prediction
Yachen Yan
Liubo Li
16
3
0
06 Jan 2023
Utilizing distilBert transformer model for sentiment classification of COVID-19's Persian open-text responses
F. Masoumi
M. Bahrani
19
2
0
16 Dec 2022
Gradient-based Intra-attention Pruning on Pre-trained Language Models
Ziqing Yang
Yiming Cui
Xin Yao
Shijin Wang
VLM
37
8
0
15 Dec 2022
Vision Transformer Computation and Resilience for Dynamic Inference
Kavya Sreedhar
Jason Clemons
Rangharajan Venkatesan
S. Keckler
M. Horowitz
26
2
0
06 Dec 2022
Momentum Decoding: Open-ended Text Generation As Graph Exploration
Tian Lan
Yixuan Su
Shuhang Liu
Heyan Huang
Xian-Ling Mao
47
5
0
05 Dec 2022
Understanding the Robustness of Multi-Exit Models under Common Corruptions
Akshay Mehra
Skyler Seto
Navdeep Jaitly
B. Theobald
AAML
16
3
0
03 Dec 2022
You Need Multiple Exiting: Dynamic Early Exiting for Accelerating Unified Vision Language Model
Sheng Tang
Yaqing Wang
Zhenglun Kong
Tianchi Zhang
Yao Li
Caiwen Ding
Yanzhi Wang
Yi Liang
Dongkuan Xu
33
31
0
21 Nov 2022
Compressing Transformer-based self-supervised models for speech processing
Tzu-Quan Lin
Tsung-Huan Yang
Chun-Yao Chang
Kuang-Ming Chen
Tzu-hsun Feng
Hung-yi Lee
Hao Tang
40
6
0
17 Nov 2022
Fast and Accurate FSA System Using ELBERT: An Efficient and Lightweight BERT
Siyuan Lu
Chenchen Zhou
Keli Xie
Jun Lin
Zhongfeng Wang
24
1
0
16 Nov 2022
COST-EFF: Collaborative Optimization of Spatial and Temporal Efficiency with Slenderized Multi-exit Language Models
Bowen Shen
Zheng Lin
Yuanxin Liu
Zhengxiao Liu
Lei Wang
Weiping Wang
VLM
47
4
0
27 Oct 2022
Efficiently Controlling Multiple Risks with Pareto Testing
Bracha Laufer-Goldshtein
Adam Fisch
Regina Barzilay
Tommi Jaakkola
36
16
0
14 Oct 2022
Fluid Batching: Exit-Aware Preemptive Serving of Early-Exit Neural Networks on Edge NPUs
Alexandros Kouris
Stylianos I. Venieris
Stefanos Laskaridis
Nicholas D. Lane
42
8
0
27 Sep 2022
Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso
Ji-Ung Lee
Tianchu Ji
Betty van Aken
Qingqing Cao
...
Emma Strubell
Niranjan Balasubramanian
Leon Derczynski
Iryna Gurevych
Roy Schwartz
30
109
0
31 Aug 2022
Building an Efficiency Pipeline: Commutativity and Cumulativeness of Efficiency Operators for Transformers
Ji Xin
Raphael Tang
Zhiying Jiang
Yaoliang Yu
Jimmy J. Lin
18
1
0
31 Jul 2022
HiVLP: Hierarchical Vision-Language Pre-Training for Fast Image-Text Retrieval
Feilong Chen
Xiuyi Chen
Jiaxin Shi
Duzhen Zhang
Jianlong Chang
Qi Tian
VLM
CLIP
34
6
0
24 May 2022
Certified Error Control of Candidate Set Pruning for Two-Stage Relevance Ranking
Minghan Li
Xinyu Crystina Zhang
Ji Xin
Hongyang R. Zhang
Jimmy J. Lin
38
6
0
19 May 2022
A Fast Attention Network for Joint Intent Detection and Slot Filling on Edge Devices
Liang Huang
Senjie Liang
Feiyang Ye
Nan Gao
57
4
0
16 May 2022
PALBERT: Teaching ALBERT to Ponder
Nikita Balagansky
Daniil Gavrilov
MoE
29
6
0
07 Apr 2022
A Simple Hash-Based Early Exiting Approach For Language Understanding and Generation
Tianxiang Sun
Xiangyang Liu
Wei-wei Zhu
Zhichao Geng
Lingling Wu
Yilong He
Yuan Ni
Guotong Xie
Xuanjing Huang
Xipeng Qiu
37
40
0
03 Mar 2022
AdaViT: Adaptive Tokens for Efficient Vision Transformer
Hongxu Yin
Arash Vahdat
J. Álvarez
Arun Mallya
Jan Kautz
Pavlo Molchanov
ViT
35
314
0
14 Dec 2021
Introspective Distillation for Robust Question Answering
Yulei Niu
Hanwang Zhang
27
59
0
01 Nov 2021
Towards Efficient NLP: A Standard Evaluation and A Strong Baseline
Xiangyang Liu
Tianxiang Sun
Junliang He
Jiawen Wu
Lingling Wu
Xinyu Zhang
Hao Jiang
Bo Zhao
Xuanjing Huang
Xipeng Qiu
ELM
28
46
0
13 Oct 2021
DACT-BERT: Differentiable Adaptive Computation Time for an Efficient BERT Inference
Cristobal Eyzaguirre
Felipe del-Rio
Vladimir Araujo
Alvaro Soto
16
7
0
24 Sep 2021
Will this Question be Answered? Question Filtering via Answer Model Distillation for Efficient Question Answering
Siddhant Garg
Alessandro Moschitti
29
26
0
14 Sep 2021
Sequential Attention Module for Natural Language Processing
Mengyuan Zhou
Jian Ma
Haiqing Yang
Lian-Xin Jiang
Yang Mo
AI4TS
27
2
0
07 Sep 2021
Training Adaptive Computation for Open-Domain Question Answering with Computational Constraints
Yuxiang Wu
Pasquale Minervini
Pontus Stenetorp
Sebastian Riedel
27
5
0
05 Jul 2021
Elbert: Fast Albert with Confidence-Window Based Early Exit
Keli Xie
Siyuan Lu
Meiqi Wang
Zhongfeng Wang
14
20
0
01 Jul 2021
Deep Learning Through the Lens of Example Difficulty
R. Baldock
Hartmut Maennel
Behnam Neyshabur
47
156
0
17 Jun 2021
Accelerating BERT Inference for Sequence Labeling via Early-Exit
Xiaonan Li
Yunfan Shao
Tianxiang Sun
Hang Yan
Xipeng Qiu
Xuanjing Huang
24
40
0
28 May 2021
Retraining DistilBERT for a Voice Shopping Assistant by Using Universal Dependencies
P. Jayarao
Arpit Sharma
16
2
0
29 Mar 2021
Split Computing and Early Exiting for Deep Learning Applications: Survey and Research Challenges
Yoshitomo Matsubara
Marco Levorato
Francesco Restuccia
33
199
0
08 Mar 2021
AutoFreeze: Automatically Freezing Model Blocks to Accelerate Fine-tuning
Yuhan Liu
Saurabh Agarwal
Shivaram Venkataraman
OffRL
16
53
0
02 Feb 2021