1709.01686
BranchyNet: Fast Inference via Early Exiting from Deep Neural Networks
6 September 2017
Surat Teerapittayanon
Bradley McDanel
H. T. Kung
UQCV
Papers citing
"BranchyNet: Fast Inference via Early Exiting from Deep Neural Networks"
50 / 251 papers shown
Title
HeurAgenix: Leveraging LLMs for Solving Complex Combinatorial Optimization Challenges
Xianliang Yang
Ling Zhang
Haolong Qian
Lei Song
Jiang Bian
19
0
0
18 Jun 2025
GroupNL: Low-Resource and Robust CNN Design over Cloud and Device
Chuntao Ding
Jianhang Xie
Junna Zhang
Salman Raza
Shangguang Wang
Jiannong Cao
OOD
44
0
0
14 Jun 2025
The Effect of Stochasticity in Score-Based Diffusion Sampling: a KL Divergence Analysis
Bernardo P. Schaeffer
Ricardo M. S. Rosa
Glauco Valle
DiffM
15
0
0
13 Jun 2025
Efficiency Robustness of Dynamic Deep Learning Systems
Ravishka Rathnasuriya
Tingxi Li
Zexin Xu
Zihe Song
Mirazul Haque
Simin Chen
Wei Yang
AAML
SILM
150
0
0
12 Jun 2025
Revisit What You See: Disclose Language Prior in Vision Tokens for Efficient Guided Decoding of LVLMs
Beomsik Cho
Jaehyung Kim
68
0
0
11 Jun 2025
FREE: Fast and Robust Vision Language Models with Early Exits
Divya J. Bajpai
M. Hanawal
VLM
19
0
0
07 Jun 2025
AdaDecode: Accelerating LLM Decoding with Adaptive Layer Parallelism
Zhepei Wei
Wei-Lin Chen
Xinyu Zhu
Yu Meng
OffRL
117
0
0
04 Jun 2025
Multi-Exit Kolmogorov-Arnold Networks: enhancing accuracy and parsimony
James P. Bagrow
Josh Bongard
41
0
0
03 Jun 2025
Matryoshka Model Learning for Improved Elastic Student Models
Chetan Verma
Aditya Srinivas Timmaraju
Cho-Jui Hsieh
Suyash Damle
Ngot Bui
Y. Zhang
Wen Chen
Xin Liu
Prateek Jain
Inderjit S Dhillon
109
0
0
29 May 2025
Leveraging Stochastic Depth Training for Adaptive Inference
Guilherme Korol
Antonio Carlos Schneider Beck
Jeronimo Castrillon
187
0
0
23 May 2025
Accelerating Adaptive Retrieval Augmented Generation via Instruction-Driven Representation Reduction of Retrieval Overlaps
Jie Ou
Jinyu Guo
Shuaihong Jiang
Zhaokun Wang
Libo Qin
Shunyu Yao
Wenhong Tian
3DV
163
0
0
19 May 2025
Onboard Optimization and Learning: A Survey
Monirul Islam Pavel
Siyi Hu
Mahardhika Pratama
Ryszard Kowalczyk
68
0
0
07 May 2025
Optimizing LLMs for Resource-Constrained Environments: A Survey of Model Compression Techniques
Sanjay Surendranath Girija
Shashank Kapoor
Lakshit Arora
Dipen Pradhan
Aman Raj
Ankit Shetgaonkar
160
0
0
05 May 2025
DPNet: Dynamic Pooling Network for Tiny Object Detection
Luqi Gong
Haotian Chen
Yushen Chen
Tianliang Yao
Chao Li
Shuai Zhao
Guangjie Han
ObjD
452
0
0
05 May 2025
DyDiT++: Dynamic Diffusion Transformers for Efficient Visual Generation
Wangbo Zhao
Yizeng Han
Jiasheng Tang
Kai Wang
Hao Luo
Yibing Song
Gao Huang
Fan Wang
Yang You
163
0
0
09 Apr 2025
Dynamic Pricing for On-Demand DNN Inference in the Edge-AI Market
Songyuan Li
Jia Hu
Geyong Min
Haojun Huang
Jiwei Huang
90
0
0
06 Mar 2025
EPEE: Towards Efficient and Effective Foundation Models in Biomedicine
Zaifu Zhan
Shuang Zhou
Huixue Zhou
Ziqiang Liu
Rui Zhang
85
1
0
03 Mar 2025
AgroLLM: Connecting Farmers and Agricultural Practices through Large Language Models for Enhanced Knowledge Transfer and Practical Application
Dinesh Jackson Samuel
Inna Skarga-Bandurova
David Sikolia
Muhammad Awais
82
1
0
28 Feb 2025
The Representation and Recall of Interwoven Structured Knowledge in LLMs: A Geometric and Layered Analysis
Ge Lei
Samuel J. Cooper
KELM
89
0
0
15 Feb 2025
DistrEE: Distributed Early Exit of Deep Neural Network Inference on Edge Devices
Xian Peng
Xin Wu
Lianming Xu
Li Wang
Aiguo Fei
71
0
0
06 Feb 2025
BEEM: Boosting Performance of Early Exit DNNs using Multi-Exit Classifiers as Experts
Divya J. Bajpai
M. Hanawal
158
1
0
02 Feb 2025
DCentNet: Decentralized Multistage Biomedical Signal Classification using Early Exits
Xiaolin Li
Binhua Huang
B. Cardiff
Deepu John
71
0
0
31 Jan 2025
PTEENet: Post-Trained Early-Exit Neural Networks Augmentation for Inference Cost Optimization
Assaf Lahiany
Yehudit Aperstein
93
4
0
07 Jan 2025
CE-CoLLM: Efficient and Adaptive Large Language Models Through Cloud-Edge Collaboration
Hongpeng Jin
Yanzhao Wu
159
5
0
05 Nov 2024
MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation
Chenxi Wang
Xiang Chen
N. Zhang
Bozhong Tian
Haoming Xu
Shumin Deng
Ningyu Zhang
MLLM
LRM
261
10
0
15 Oct 2024
Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models
Xin Zou
Yizhou Wang
Yibo Yan
Yuanhuiyi Lyu
Kening Zheng
...
Junkai Chen
Peijie Jiang
Qingbin Liu
Chang Tang
Xuming Hu
165
8
0
04 Oct 2024
E-QUARTIC: Energy Efficient Edge Ensemble of Convolutional Neural Networks for Resource-Optimized Learning
Le Zhang
Onat Gungor
Flavio Ponzina
T. Rosing
MQ
53
0
0
12 Sep 2024
Network Fission Ensembles for Low-Cost Self-Ensembles
Hojung Lee
Jong-Seok Lee
UQCV
173
2
0
05 Aug 2024
How to Train Your Multi-Exit Model? Analyzing the Impact of Training Strategies
Bartłomiej Krzepkowski
Bartosz Wójcik
Franciszek Szarwacki
Piotr Kubaty
Jary Pomponi
Tomasz Trzciński
85
1
0
19 Jul 2024
S3D: A Simple and Cost-Effective Self-Speculative Decoding Scheme for Low-Memory GPUs
Wei Zhong
Manasa Bharadwaj
115
7
0
30 May 2024
A Comprehensive Survey of Accelerated Generation Techniques in Large Language Models
Mahsa Khoshnoodi
Vinija Jain
Mingye Gao
Malavika Srikanth
Aman Chadha
OffRL
125
5
0
15 May 2024
Mélange: Cost Efficient Large Language Model Serving by Exploiting GPU Heterogeneity
Tyler Griggs
Xiaoxuan Liu
Jiaxiang Yu
Doyoung Kim
Wei-Lin Chiang
Alvin Cheung
Ion Stoica
118
18
0
22 Apr 2024
Accelerating Inference in Large Language Models with a Unified Layer Skipping Strategy
Yijin Liu
Fandong Meng
Jie Zhou
AI4CE
81
9
0
10 Apr 2024
Radial Networks: Dynamic Layer Routing for High-Performance Large Language Models
Jordan Dotzel
Yash Akhauri
Ahmed S. AbouElhamayed
Carly Jiang
Mohamed S. Abdelfattah
Zhiru Zhang
MoE
28
2
0
07 Apr 2024
Tiny Models are the Computational Saver for Large Models
Qingyuan Wang
B. Cardiff
Antoine Frappé
Benoît Larras
Deepu John
147
2
0
26 Mar 2024
On the Impact of Black-box Deployment Strategies for Edge AI on Latency and Model Performance
Jaskirat Singh
Emad Fallahzadeh
Bram Adams
Ahmed E. Hassan
MQ
155
3
0
25 Mar 2024
UR2M: Uncertainty and Resource-Aware Event Detection on Microcontrollers
Hong Jia
Young D. Kwon
Dong Ma
Nhat Pham
Lorena Qendro
Tam N. Vu
Cecilia Mascolo
71
2
0
14 Feb 2024
Understanding the Training Speedup from Sampling with Approximate Losses
Rudrajit Das
Xi Chen
Bertram Ieong
Parikshit Bansal
Sujay Sanghavi
51
2
0
10 Feb 2024
NACHOS: Neural Architecture Search for Hardware Constrained Early Exit Neural Networks
Matteo Gambella
Jary Pomponi
Simone Scardapane
Manuel Roveri
93
2
0
24 Jan 2024
Learning to Compose SuperWeights for Neural Parameter Allocation Search
Piotr Teterwak
Soren Nelson
Nikoli Dryden
D. Bashkirova
Kate Saenko
Bryan A. Plummer
108
2
0
03 Dec 2023
Uncertainty Quantification in Machine Learning for Biosignal Applications -- A Review
Ivo Pascal de Jong
A. Sburlea
Matias Valdenegro-Toro
89
2
0
15 Nov 2023
PAUMER: Patch Pausing Transformer for Semantic Segmentation
Evann Courdier
Prabhu Teja Sivaprasad
François Fleuret
94
2
0
01 Nov 2023
An automated approach for improving the inference latency and energy efficiency of pretrained CNNs by removing irrelevant pixels with focused convolutions
Caleb Tung
Nick Eliopoulos
Purvish Jajal
Gowri Ramshankar
Chen-Yun Yang
Nicholas Synovic
Xuecen Zhang
Vipin Chaudhary
George K. Thiruvathukal
Yu Lu
33
0
0
11 Oct 2023
AdaDiff: Accelerating Diffusion Models through Step-Wise Adaptive Computation
Shengkun Tang
Yaqing Wang
Maksim Dzhigil
Yi Liang
Yongbin Li
Dongkuan Xu
64
7
0
29 Sep 2023
The Grand Illusion: The Myth of Software Portability and Implications for ML Progress
Fraser Mince
Dzung Dinh
Jonas Kgomo
Neil Thompson
Sara Hooker
58
6
0
12 Sep 2023
The Adversarial Implications of Variable-Time Inference
Dudi Biton
Aditi Misra
Efrat Levy
J. Kotak
Ron Bitton
R. Schuster
Nicolas Papernot
Yuval Elovici
Ben Nassi
AAML
34
0
0
05 Sep 2023
Mobile Foundation Model as Firmware
Jinliang Yuan
Chenchen Yang
Dongqi Cai
Shihe Wang
Xin Yuan
...
Di Zhang
Hanzi Mei
Xianqing Jia
Shangguang Wang
Mengwei Xu
120
22
0
28 Aug 2023
Using Early Exits for Fast Inference in Automatic Modulation Classification
E. Mohammed
Omar Mashaal
H. Abou-zeid
37
3
0
22 Aug 2023
CSI-Based Efficient Self-Quarantine Monitoring System Using Branchy Convolution Neural Network
Jingtao Guo
I. W. Ho
17
2
0
24 May 2023
F-PABEE: Flexible-patience-based Early Exiting for Single-label and Multi-label text Classification Tasks
Xiangxiang Gao
Wei-wei Zhu
Jiasheng Gao
Congrui Yin
VLM
92
12
0
21 May 2023