BERT Loses Patience: Fast and Robust Inference with Early Exit

7 June 2020
Wangchunshu Zhou
Canwen Xu
Tao Ge
Julian McAuley
Ke Xu
Furu Wei

Papers citing "BERT Loses Patience: Fast and Robust Inference with Early Exit"

50 / 80 papers shown
The Energy Cost of Reasoning: Analyzing Energy Usage in LLMs with Test-time Compute
Yunho Jin
Gu-Yeon Wei
David Brooks
LRM
7
0
0
20 May 2025
Accelerating Adaptive Retrieval Augmented Generation via Instruction-Driven Representation Reduction of Retrieval Overlaps
Jie Ou
Jinyu Guo
Shuaihong Jiang
Zhaokun Wang
Libo Qin
Shunyu Yao
Wenhong Tian
3DV
22
0
0
19 May 2025
Towards Understanding How Knowledge Evolves in Large Vision-Language Models
Sudong Wang
Yuyao Zhang
Yao Zhu
Jianing Li
Zizhe Wang
Yi Liu
Xiangyang Ji
190
0
0
31 Mar 2025
Language Models Can Predict Their Own Behavior
Dhananjay Ashok
Jonathan May
ReLM
AI4TS
LRM
63
0
0
18 Feb 2025
Large Language Models Are Human-Like Internally
Tatsuki Kuribayashi
Yohei Oseki
Souhaib Ben Taieb
Kentaro Inui
Timothy Baldwin
73
5
0
03 Feb 2025
BEEM: Boosting Performance of Early Exit DNNs using Multi-Exit Classifiers as Experts
Divya J. Bajpai
M. Hanawal
80
0
0
02 Feb 2025
Hyper-multi-step: The Truth Behind Difficult Long-context Tasks
Yijiong Yu
Ma Xiufa
Fang Jianwei
Zhi-liang Xu
Su Guangyao
...
Zhixiao Qi
Wei Wang
Wen Liu
Ran Chen
Ji Pei
LRM
RALM
35
0
0
06 Oct 2024
Membership Inference Attack Against Masked Image Modeling
Zehan Li
Xinlei He
Ning Yu
Yang Zhang
46
1
0
13 Aug 2024
Accelerating Large Language Model Inference with Self-Supervised Early Exits
Florian Valade
LRM
44
1
0
30 Jul 2024
DistilDoc: Knowledge Distillation for Visually-Rich Document Applications
Jordy Van Landeghem
Subhajit Maity
Ayan Banerjee
Matthew Blaschko
Marie-Francine Moens
Josep Lladós
Sanket Biswas
52
2
0
12 Jun 2024
DAISY: Data Adaptive Self-Supervised Early Exit for Speech Representation Models
T. Lin
Hung-yi Lee
Hao Tang
53
1
0
08 Jun 2024
Mélange: Cost Efficient Large Language Model Serving by Exploiting GPU Heterogeneity
Tyler Griggs
Xiaoxuan Liu
Jiaxiang Yu
Doyoung Kim
Wei-Lin Chiang
Alvin Cheung
Ion Stoica
54
16
0
22 Apr 2024
Exploring Dynamic Transformer for Efficient Object Tracking
Jiawen Zhu
Xin Chen
Haiwen Diao
Shuai Li
Jun-Yan He
Chenyang Li
Bin Luo
Dong Wang
Huchuan Lu
60
2
0
26 Mar 2024
On the Impact of Black-box Deployment Strategies for Edge AI on Latency and Model Performance
Jaskirat Singh
Emad Fallahzadeh
Bram Adams
Ahmed E. Hassan
MQ
40
3
0
25 Mar 2024
LoraRetriever: Input-Aware LoRA Retrieval and Composition for Mixed Tasks in the Wild
Ziyu Zhao
Leilei Gan
Guoyin Wang
Wangchunshu Zhou
Hongxia Yang
Kun Kuang
Fei Wu
MoMe
26
31
0
15 Feb 2024
DE$^3$-BERT: Distance-Enhanced Early Exiting for BERT based on Prototypical Networks
Jianing He
Qi Zhang
Weiping Ding
Duoqian Miao
Jun Zhao
Liang Hu
LongBing Cao
40
3
0
03 Feb 2024
Investigating Recurrent Transformers with Dynamic Halt
Jishnu Ray Chowdhury
Cornelia Caragea
43
1
0
01 Feb 2024
EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism
Yanxi Chen
Xuchen Pan
Yaliang Li
Bolin Ding
Jingren Zhou
LRM
41
30
0
08 Dec 2023
Subnetwork-to-go: Elastic Neural Network with Dynamic Training and Customizable Inference
Kai Li
Yi Luo
34
2
0
06 Dec 2023
PAUMER: Patch Pausing Transformer for Semantic Segmentation
Evann Courdier
Prabhu Teja Sivaprasad
François Fleuret
44
2
0
01 Nov 2023
Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models
Ruida Wang
Wangchunshu Zhou
Mrinmaya Sachan
29
32
0
20 Oct 2023
AutoMix: Automatically Mixing Language Models
Pranjal Aggarwal
Aman Madaan
Ankit Anand
Srividya Pranavi Potharaju
Swaroop Mishra
...
Karthik Kappaganthu
Yiming Yang
Shyam Upadhyay
Manaal Faruqui
Mausam
42
19
0
19 Oct 2023
DecoderLens: Layerwise Interpretation of Encoder-Decoder Transformers
Anna Langedijk
Hosein Mohebbi
Gabriele Sarti
Willem H. Zuidema
Jaap Jumelet
32
10
0
05 Oct 2023
SplitEE: Early Exit in Deep Neural Networks with Split Computing
Divya J. Bajpai
Vivek K. Trivedi
S. L. Yadav
M. Hanawal
28
5
0
17 Sep 2023
Dynamic nsNet2: Efficient Deep Noise Suppression with Early Exiting
Riccardo Miccini
Alaa Zniber
Clément Laroche
Tobias Piechowiak
Martin Schoeberl
Luca Pezzarossa
Ouassim Karrakchou
J. Sparsø
Mounir Ghogho
33
1
0
31 Aug 2023
Mobile Foundation Model as Firmware
Jinliang Yuan
Chenchen Yang
Dongqi Cai
Shihe Wang
Xin Yuan
...
Di Zhang
Hanzi Mei
Xianqing Jia
Shangguang Wang
Mengwei Xu
42
19
0
28 Aug 2023
Deep Model Compression Also Helps Models Capture Ambiguity
Hancheol Park
Jong C. Park
33
1
0
12 Jun 2023
PuMer: Pruning and Merging Tokens for Efficient Vision Language Models
Qingqing Cao
Bhargavi Paranjape
Hannaneh Hajishirzi
MLLM
VLM
18
21
0
27 May 2023
F-PABEE: Flexible-patience-based Early Exiting for Single-label and Multi-label text Classification Tasks
Xiangxiang Gao
Wei-wei Zhu
Jiasheng Gao
Congrui Yin
VLM
28
12
0
21 May 2023
Lifting the Curse of Capacity Gap in Distilling Language Models
Chen Zhang
Yang Yang
Jiahao Liu
Jingang Wang
Yunsen Xian
Benyou Wang
Dawei Song
MoE
32
19
0
20 May 2023
Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning and Coding with LLMs
Pranjal Aggarwal
Aman Madaan
Yiming Yang
Mausam
LRM
38
38
0
19 May 2023
Gradient-Free Structured Pruning with Unlabeled Data
Azade Nova
H. Dai
Dale Schuurmans
SyDa
40
20
0
07 Mar 2023
Aegis: Mitigating Targeted Bit-flip Attacks against Deep Neural Networks
Jialai Wang
Ziyuan Zhang
Meiqi Wang
Han Qiu
Tianwei Zhang
Qi Li
Zongpeng Li
Tao Wei
Chao Zhang
AAML
22
20
0
27 Feb 2023
Towards Inference Efficient Deep Ensemble Learning
Ziyue Li
Kan Ren
Yifan Yang
Xinyang Jiang
Yuqing Yang
Dongsheng Li
BDL
29
12
0
29 Jan 2023
Adaptive Deep Neural Network Inference Optimization with EENet
Fatih Ilhan
Ka-Ho Chow
Sihao Hu
Tiansheng Huang
Selim Tekin
...
Myungjin Lee
Ramana Rao Kompella
Hugo Latapie
Gan Liu
Ling Liu
41
11
0
15 Jan 2023
AdaEnsemble: Learning Adaptively Sparse Structured Ensemble Network for Click-Through Rate Prediction
Yachen Yan
Liubo Li
22
3
0
06 Jan 2023
Mind Your Heart: Stealthy Backdoor Attack on Dynamic Deep Neural Network in Edge Computing
Tian Dong
Ziyuan Zhang
Han Qiu
Tianwei Zhang
Hewu Li
T. Wang
AAML
28
6
0
22 Dec 2022
Vision Transformer Computation and Resilience for Dynamic Inference
Kavya Sreedhar
Jason Clemons
Rangharajan Venkatesan
S. Keckler
M. Horowitz
32
2
0
06 Dec 2022
Understanding the Robustness of Multi-Exit Models under Common Corruptions
Akshay Mehra
Skyler Seto
Navdeep Jaitly
B. Theobald
AAML
24
3
0
03 Dec 2022
Towards Practical Few-shot Federated NLP
Dongqi Cai
Yaozong Wu
Haitao Yuan
Shangguang Wang
F. Lin
Mengwei Xu
FedML
42
6
0
01 Dec 2022
You Need Multiple Exiting: Dynamic Early Exiting for Accelerating Unified Vision Language Model
Sheng Tang
Yaqing Wang
Zhenglun Kong
Tianchi Zhang
Yao Li
Caiwen Ding
Yanzhi Wang
Yi Liang
Dongkuan Xu
33
32
0
21 Nov 2022
Avoid Overthinking in Self-Supervised Models for Speech Recognition
Dan Berrebbi
Brian Yan
Shinji Watanabe
LRM
26
4
0
01 Nov 2022
Efficient Graph Neural Network Inference at Large Scale
Xin-pu Gao
Wentao Zhang
Yingxia Shao
Quoc Viet Hung Nguyen
Bin Cui
Hongzhi Yin
AI4CE
GNN
62
8
0
01 Nov 2022
COST-EFF: Collaborative Optimization of Spatial and Temporal Efficiency with Slenderized Multi-exit Language Models
Bowen Shen
Zheng Lin
Yuanxin Liu
Zhengxiao Liu
Lei Wang
Weiping Wang
VLM
52
4
0
27 Oct 2022
Hidden State Variability of Pretrained Language Models Can Guide Computation Reduction for Transfer Learning
Shuo Xie
Jiahao Qiu
Ankita Pasad
Li Du
Qing Qu
Hongyuan Mei
37
16
0
18 Oct 2022
Fluid Batching: Exit-Aware Preemptive Serving of Early-Exit Neural Networks on Edge NPUs
Alexandros Kouris
Stylianos I. Venieris
Stefanos Laskaridis
Nicholas D. Lane
42
8
0
27 Sep 2022
Building an Efficiency Pipeline: Commutativity and Cumulativeness of Efficiency Operators for Transformers
Ji Xin
Raphael Tang
Zhiying Jiang
Yaoliang Yu
Jimmy J. Lin
20
1
0
31 Jul 2022
VLUE: A Multi-Task Benchmark for Evaluating Vision-Language Models
Wangchunshu Zhou
Yan Zeng
Shizhe Diao
Xinsong Zhang
CoGe
VLM
32
13
0
30 May 2022
Transkimmer: Transformer Learns to Layer-wise Skim
Yue Guan
Zhengyi Li
Jingwen Leng
Zhouhan Lin
Minyi Guo
80
38
0
15 May 2022
PALBERT: Teaching ALBERT to Ponder
Nikita Balagansky
Daniil Gavrilov
MoE
29
6
0
07 Apr 2022