Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1709.01686
Cited By
BranchyNet: Fast Inference via Early Exiting from Deep Neural Networks
6 September 2017
Surat Teerapittayanon
Bradley McDanel
H. T. Kung
UQCV
Re-assign community
ArXiv
PDF
HTML
Papers citing
"BranchyNet: Fast Inference via Early Exiting from Deep Neural Networks"
50 / 167 papers shown
Title
The Energy Cost of Reasoning: Analyzing Energy Usage in LLMs with Test-time Compute
Yunho Jin
Gu-Yeon Wei
David Brooks
LRM
7
0
0
20 May 2025
Accelerating Adaptive Retrieval Augmented Generation via Instruction-Driven Representation Reduction of Retrieval Overlaps
Jie Ou
Jinyu Guo
Shuaihong Jiang
Zhaokun Wang
Libo Qin
Shunyu Yao
Wenhong Tian
3DV
22
0
0
19 May 2025
Onboard Optimization and Learning: A Survey
Monirul Islam Pavel
Siyi Hu
Mahardhika Pratama
Ryszard Kowalczyk
26
0
0
07 May 2025
Optimizing LLMs for Resource-Constrained Environments: A Survey of Model Compression Techniques
Sanjay Surendranath Girija
Shashank Kapoor
Lakshit Arora
Dipen Pradhan
Aman Raj
Ankit Shetgaonkar
57
0
0
05 May 2025
DPNet: Dynamic Pooling Network for Tiny Object Detection
Luqi Gong
Haotian Chen
Yushen Chen
Tianliang Yao
Chao Li
Shuai Zhao
Guangjie Han
ObjD
215
0
0
05 May 2025
DYNAMAX: Dynamic computing for Transformers and Mamba based architectures
Miguel Nogales
Matteo Gambella
Manuel Roveri
56
0
0
29 Apr 2025
Bi-directional Model Cascading with Proxy Confidence
David Warren
Mark Dras
51
0
0
27 Apr 2025
EPSILON: Adaptive Fault Mitigation in Approximate Deep Neural Network using Statistical Signatures
Khurram Khalil
K. A. Hoque
AAML
81
0
0
24 Apr 2025
DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation
Wangbo Zhao
Yizeng Han
Jiasheng Tang
Kai Wang
Hao Luo
Yibing Song
Gao Huang
Fan Wang
Yang You
74
0
0
09 Apr 2025
MoSE: Hierarchical Self-Distillation Enhances Early Layer Embeddings
Andrea Gurioli
Federico Pennino
João Monteiro
Maurizio Gabbrielli
51
0
0
04 Mar 2025
Fast and Accurate Gigapixel Pathological Image Classification with Hierarchical Distillation Multi-Instance Learning
Jiuyang Dong
Junjun Jiang
Kui Jiang
Jiahan Li
Yongbing Zhang
47
0
0
28 Feb 2025
The Representation and Recall of Interwoven Structured Knowledge in LLMs: A Geometric and Layered Analysis
Ge Lei
Samuel J. Cooper
KELM
51
0
0
15 Feb 2025
DistrEE: Distributed Early Exit of Deep Neural Network Inference on Edge Devices
Xian Peng
Xin Wu
Lianming Xu
Li Wang
Aiguo Fei
36
0
0
06 Feb 2025
BEEM: Boosting Performance of Early Exit DNNs using Multi-Exit Classifiers as Experts
Divya J. Bajpai
M. Hanawal
80
0
0
02 Feb 2025
DCentNet: Decentralized Multistage Biomedical Signal Classification using Early Exits
Xiaolin Li
Binhua Huang
B. Cardiff
Deepu John
46
0
0
31 Jan 2025
PTEENet: Post-Trained Early-Exit Neural Networks Augmentation for Inference Cost Optimization
Assaf Lahiany
Yehudit Aperstein
33
4
0
07 Jan 2025
MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation
Chenxi Wang
Xiang Chen
N. Zhang
Bozhong Tian
Haoming Xu
Shumin Deng
Huajun Chen
MLLM
LRM
44
4
0
15 Oct 2024
Hyper-multi-step: The Truth Behind Difficult Long-context Tasks
Yijiong Yu
Ma Xiufa
Fang Jianwei
Zhi-liang Xu
Su Guangyao
...
Zhixiao Qi
Wei Wang
Wen Liu
Ran Chen
Ji Pei
LRM
RALM
29
0
0
06 Oct 2024
SOI: Scaling Down Computational Complexity by Estimating Partial States of the Model
Grzegorz Stefański
P. Daniluk
Artur Szumaczuk
Jakub Tkaczuk
31
0
0
04 Oct 2024
Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models
Xin Zou
Yizhou Wang
Yibo Yan
Yuanhuiyi Lyu
Kening Zheng
...
Junkai Chen
Peijie Jiang
Qingbin Liu
Chang Tang
Xuming Hu
89
7
0
04 Oct 2024
Network Fission Ensembles for Low-Cost Self-Ensembles
Hojung Lee
Jong-Seok Lee
UQCV
64
0
0
05 Aug 2024
Accelerating Large Language Model Inference with Self-Supervised Early Exits
Florian Valade
LRM
44
1
0
30 Jul 2024
Learning Motion Blur Robust Vision Transformers with Dynamic Early Exit for Real-Time UAV Tracking
You Wu
Xucheng Wang
Dan Zeng
Hengzhou Ye
Xiaolan Xie
Qijun Zhao
Shuiwang Li
42
3
0
07 Jul 2024
Model Adaptation for Time Constrained Embodied Control
Jaehyun Song
Minjong Yoo
Honguk Woo
55
0
0
17 Jun 2024
DAISY: Data Adaptive Self-Supervised Early Exit for Speech Representation Models
T. Lin
Hung-yi Lee
Hao Tang
53
1
0
08 Jun 2024
S3D: A Simple and Cost-Effective Self-Speculative Decoding Scheme for Low-Memory GPUs
Wei Zhong
Manasa Bharadwaj
49
5
0
30 May 2024
Towards Energy-Aware Federated Learning via MARL: A Dual-Selection Approach for Model and Client
Jun Xia
Yi Zhang
Yiyu Shi
33
1
0
13 May 2024
Mélange: Cost Efficient Large Language Model Serving by Exploiting GPU Heterogeneity
Tyler Griggs
Xiaoxuan Liu
Jiaxiang Yu
Doyoung Kim
Wei-Lin Chiang
Alvin Cheung
Ion Stoica
54
16
0
22 Apr 2024
Accelerating Inference in Large Language Models with a Unified Layer Skipping Strategy
Yijin Liu
Fandong Meng
Jie Zhou
AI4CE
27
7
0
10 Apr 2024
Tiny Models are the Computational Saver for Large Models
Qingyuan Wang
B. Cardiff
Antoine Frappé
Benoît Larras
Deepu John
46
2
0
26 Mar 2024
On the Impact of Black-box Deployment Strategies for Edge AI on Latency and Model Performance
Jaskirat Singh
Emad Fallahzadeh
Bram Adams
Ahmed E. Hassan
MQ
40
3
0
25 Mar 2024
Cooperative Learning for Cost-Adaptive Inference
Xingli Fang
Richard M. Bradford
Jung-Eun Kim
45
1
0
13 Dec 2023
EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism
Yanxi Chen
Xuchen Pan
Yaliang Li
Bolin Ding
Jingren Zhou
LRM
41
30
0
08 Dec 2023
Adaptive Early Exiting for Collaborative Inference over Noisy Wireless Channels
Mikolaj Jankowski
Deniz Gunduz
K. Mikolajczyk
32
3
0
29 Nov 2023
PAUMER: Patch Pausing Transformer for Semantic Segmentation
Evann Courdier
Prabhu Teja Sivaprasad
François Fleuret
44
2
0
01 Nov 2023
SplitEE: Early Exit in Deep Neural Networks with Split Computing
Divya J. Bajpai
Vivek K. Trivedi
S. L. Yadav
M. Hanawal
28
5
0
17 Sep 2023
Mobile Foundation Model as Firmware
Jinliang Yuan
Chenchen Yang
Dongqi Cai
Shihe Wang
Xin Yuan
...
Di Zhang
Hanzi Mei
Xianqing Jia
Shangguang Wang
Mengwei Xu
42
19
0
28 Aug 2023
Eventful Transformers: Leveraging Temporal Redundancy in Vision Transformers
Matthew Dutson
Yin Li
M. Gupta
ViT
45
8
0
25 Aug 2023
Using Early Exits for Fast Inference in Automatic Modulation Classification
E. Mohammed
Omar Mashaal
H. Abou-zeid
26
3
0
22 Aug 2023
F-PABEE: Flexible-patience-based Early Exiting for Single-label and Multi-label text Classification Tasks
Xiangxiang Gao
Wei-wei Zhu
Jiasheng Gao
Congrui Yin
VLM
28
12
0
21 May 2023
Towards Carbon-Neutral Edge Computing: Greening Edge AI by Harnessing Spot and Future Carbon Markets
Huirong Ma
Zhi Zhou
Xiaoxi Zhang
Xu Chen
18
12
0
22 Apr 2023
A Survey on Deep Neural Network Partition over Cloud, Edge and End Devices
Di Xu
Xiang He
Tonghua Su
Zhongjie Wang
32
6
0
20 Apr 2023
Big-Little Adaptive Neural Networks on Low-Power Near-Subthreshold Processors
Zichao Shen
Neil Howard
J. Núñez-Yáñez
18
2
0
19 Apr 2023
DynamicDet: A Unified Dynamic Architecture for Object Detection
Zhi-Hao Lin
Yongtao Wang
Jinhe Zhang
Xiaojie Chu
ObjD
25
30
0
12 Apr 2023
Revisiting Single-gated Mixtures of Experts
Amelie Royer
I. Karmanov
Andrii Skliar
B. Bejnordi
Tijmen Blankevoort
MoE
MoMe
41
6
0
11 Apr 2023
SEENN: Towards Temporal Spiking Early-Exit Neural Networks
Yuhang Li
Tamar Geller
Youngeun Kim
Priyadarshini Panda
26
38
0
02 Apr 2023
Memorization Capacity of Neural Networks with Conditional Computation
Erdem Koyuncu
38
4
0
20 Mar 2023
A Dynamic Multi-Scale Voxel Flow Network for Video Prediction
Xiaotao Hu
Zhewei Huang
Ailin Huang
Jun Xu
Shuchang Zhou
VGen
38
69
0
17 Mar 2023
Adaptive Rotated Convolution for Rotated Object Detection
Yifan Pu
Yiru Wang
Zhuofan Xia
Yizeng Han
Yulin Wang
Weihao Gan
Zidong Wang
S. Song
Gao Huang
25
76
0
14 Mar 2023
Map-and-Conquer: Energy-Efficient Mapping of Dynamic Neural Nets onto Heterogeneous MPSoCs
Halima Bouzidi
Mohanad Odema
Hamza Ouarnoughi
Smail Niar
Mohammad Abdullah Al Faruque
29
8
0
24 Feb 2023
1
2
3
4
Next