1510.00149
Cited By
v5 (latest)
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
1 October 2015
Song Han
Huizi Mao
W. Dally
3DGS
Papers citing
"Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"
50 / 3,481 papers shown
Title
Optimal Recurrent Network Topologies for Dynamical Systems Reconstruction
Christoph Jürgen Hemmer
Manuel Brenner
Florian Hess
Daniel Durstewitz
104
4
0
07 Jun 2024
Navigating Efficiency in MobileViT through Gaussian Process on Global Architecture Factors
Ke Meng
Kai Chen
67
0
0
07 Jun 2024
How Far Can We Compress Instant-NGP-Based NeRF?
Yihang Chen
Qianyi Wu
Mehrtash Harandi
Jianfei Cai
89
19
0
06 Jun 2024
ReDistill: Residual Encoded Distillation for Peak Memory Reduction of CNNs
Fang Chen
Gourav Datta
Mujahid Al Rafi
Hyeran Jeon
Meng Tang
229
1
0
06 Jun 2024
FusionBench: A Comprehensive Benchmark of Deep Model Fusion
Anke Tang
Li Shen
Yong Luo
Han Hu
Di Lin
Dacheng Tao
ELM
MoMe
VLM
82
27
0
05 Jun 2024
Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for Large Language Models
Peijie Dong
Lujun Li
Zhenheng Tang
Xiang Liu
Xinglin Pan
Qiang-qiang Wang
Xiaowen Chu
156
33
0
05 Jun 2024
Toward Efficient Deep Spiking Neuron Networks: A Survey On Compression
Hui Xie
Ge Yang
Wenjuan Gao
85
1
0
03 Jun 2024
Effective Interplay between Sparsity and Quantization: From Theory to Practice
Simla Burcu Harma
Ayan Chakraborty
Elizaveta Kostenok
Danila Mishin
Dongho Ha
...
Martin Jaggi
Ming Liu
Yunho Oh
Suvinay Subramanian
Amir Yazdanbakhsh
MQ
103
10
0
31 May 2024
Investigating Calibration and Corruption Robustness of Post-hoc Pruned Perception CNNs: An Image Classification Benchmark Study
Pallavi Mitra
Gesina Schwalbe
Nadja Klein
AAML
74
1
0
31 May 2024
Self-degraded contrastive domain adaptation for industrial fault diagnosis with bi-imbalanced data
Gecheng Chen
Zeyu Yang
Chengwen Luo
Jian-qiang Li
103
1
0
31 May 2024
LCQ: Low-Rank Codebook based Quantization for Large Language Models
Wen-Pu Cai
Wu-Jun Li
Wu-Jun Li
MQ
122
0
0
31 May 2024
Dual sparse training framework: inducing activation map sparsity via Transformed ℓ1 regularization
Xiaolong Yu
Cong Tian
84
0
0
30 May 2024
CiliaGraph: Enabling Expression-enhanced Hyper-Dimensional Computation in Ultra-Lightweight and One-Shot Graph Classification on Edge
Yuxi Han
Jihe Wang
Danghui Wang
81
1
0
29 May 2024
Efficient Model Compression for Hierarchical Federated Learning
Xi Zhu
Songcan Yu
Junbo Wang
Qinglin Yang
FedML
20
0
0
27 May 2024
Scorch: A Library for Sparse Deep Learning
Bobby Yan
Alexander J. Root
Trevor Gale
David Broman
Fredrik Kjolstad
78
1
0
27 May 2024
Extreme Compression of Adaptive Neural Images
Leo Hoshikawa
Marcos V. Conde
Takeshi Ohashi
Atsushi Irie
97
1
0
27 May 2024
A Provably Effective Method for Pruning Experts in Fine-tuned Sparse Mixture-of-Experts
Mohammed Nowaz Rabbani Chowdhury
Meng Wang
Kaoutar El Maghraoui
Naigang Wang
Pin-Yu Chen
Christopher Carothers
MoE
116
4
0
26 May 2024
Pruning for Robust Concept Erasing in Diffusion Models
Tianyun Yang
Juan Cao
Chang Xu
102
14
0
26 May 2024
Online Resource Allocation for Edge Intelligence with Colocated Model Retraining and Inference
Huaiguang Cai
Zhi Zhou
Qianyi Huang
71
4
0
25 May 2024
BOLD: Boolean Logic Deep Learning
Van Minh Nguyen
Cristian Ocampo
Aymen Askri
Louis Leconte
Ba-Hien Tran
AI4CE
110
1
0
25 May 2024
PatchProt: Hydrophobic patch prediction using protein foundation models
Dea Gogishvili
Emmanuel Minois-Genin
Jan van Eck
Sanne Abeln
53
2
0
24 May 2024
Sparse maximal update parameterization: A holistic approach to sparse training dynamics
Nolan Dey
Shane Bergsma
Joel Hestness
75
5
0
24 May 2024
Embedding Compression for Efficient Re-Identification
Luke McDermott
38
0
0
23 May 2024
CoMERA: Computing- and Memory-Efficient Training via Rank-Adaptive Tensor Optimization
Zi Yang
Samridhi Choudhary
Xinfeng Xie
Cao Gao
Siegfried Kunzmann
Zheng Zhang
VLM
136
10
0
23 May 2024
Efficient Multitask Dense Predictor via Binarization
Yuzhang Shang
Dan Xu
Gaowen Liu
Ramana Rao Kompella
Yan Yan
MQ
AAML
105
1
0
23 May 2024
Distill-then-prune: An Efficient Compression Framework for Real-time Stereo Matching Network on Edge Devices
Baiyu Pan
Jichao Jiao
Jianxin Pang
Jun Cheng
76
3
0
20 May 2024
Efficiency optimization of large-scale language models based on deep learning in natural language processing tasks
Taiyuan Mei
Yun Zi
X. Cheng
Zijun Gao
Qi Wang
Haowei Yang
121
20
0
20 May 2024
Cost-Effective Fault Tolerance for CNNs Using Parameter Vulnerability Based Hardening and Pruning
Mohammad Hasan Ahmadilivani
Seyedhamidreza Mousavi
J. Raik
Masoud Daneshtalab
M. Jenihhin
AAML
70
3
0
17 May 2024
Memory-efficient Energy-adaptive Inference of Pre-Trained Models on Batteryless Embedded Systems
Pietro Farina
Subrata Biswas
Eren Yildiz
Khakim Akhunov
Saad Ahmed
Bashima Islam
K. Yıldırım
89
4
0
16 May 2024
Unmasking Efficiency: Learning Salient Sparse Models in Non-IID Federated Learning
Riyasat Ohib
Bishal Thapaliya
Gintare Karolina Dziugaite
Jingyu Liu
Vince D. Calhoun
Sergey Plis
FedML
83
1
0
15 May 2024
Neural Network Compression for Reinforcement Learning Tasks
Dmitry A. Ivanov
D. Larionov
Oleg V. Maslennikov
V. Voevodin
OffRL
AI4CE
88
2
0
13 May 2024
From Algorithm to Hardware: A Survey on Efficient and Safe Deployment of Deep Neural Networks
Xue Geng
Zhe Wang
Chunyun Chen
Qing Xu
Kaixin Xu
...
Zhenghua Chen
M. Aly
Jie Lin
Min-man Wu
Xiaoli Li
87
1
0
09 May 2024
Aux-NAS: Exploiting Auxiliary Labels with Negligibly Extra Inference Cost
Yuan Gao
Weizhong Zhang
Wenhan Luo
Lin Ma
Jin-Gang Yu
Gui-Song Xia
Jiayi Ma
87
1
0
09 May 2024
Communication-Efficient Collaborative Perception via Information Filling with Codebook
Yue Hu
Juntong Peng
Si Liu
Junhao Ge
Si Liu
Siheng Chen
121
16
0
08 May 2024
Switchable Decision: Dynamic Neural Generation Networks
Shujian Zhang
Korawat Tanwisuth
Chengyue Gong
Pengcheng He
Mi Zhou
BDL
77
0
0
07 May 2024
Collage: Light-Weight Low-Precision Strategy for LLM Training
Tao Yu
Gaurav Gupta
Karthick Gopalswamy
Amith R. Mamidala
Hao Zhou
Jeffrey Huynh
Youngsuk Park
Ron Diamant
Anoop Deoras
Jun Huan
MQ
99
3
0
06 May 2024
Iterative Filter Pruning for Concatenation-based CNN Architectures
Svetlana Pavlitska
Oliver Bagge
Federico Nicolás Peccia
Toghrul Mammadov
J. Marius Zöllner
VLM
3DPC
70
3
0
04 May 2024
Random Masking Finds Winning Tickets for Parameter Efficient Fine-tuning
Jing Xu
Jingzhao Zhang
102
7
0
04 May 2024
Torch2Chip: An End-to-end Customizable Deep Neural Network Compression and Deployment Toolkit for Prototype Hardware Accelerator Design
Jian Meng
Yuan Liao
Anupreetham Anupreetham
Ahmed Hassan
Shixing Yu
Han-Sok Suh
Xiaofeng Hu
Jae-sun Seo
MQ
97
2
0
02 May 2024
Enhancing User Experience in On-Device Machine Learning with Gated Compression Layers
Haiguang Li
Usama Pervaiz
Joseph Antognini
Michal Matuszak
Lawrence Au
Gilles Roux
T. Thormundsson
81
0
0
02 May 2024
COPAL: Continual Pruning in Large Language Generative Models
Srikanth Malla
Joon Hee Choi
Chiho Choi
VLM
CLL
90
2
0
02 May 2024
Learning a Sparse Neural Network using IHT
S. Damadi
Soroush Zolfaghari
Mahdi Rezaie
Jinglai Shen
66
0
0
29 Apr 2024
On TinyML and Cybersecurity: Electric Vehicle Charging Infrastructure Use Case
Fatemeh Dehrouyeh
Li Yang
F. Badrkhani Ajaei
Abdallah Shami
102
9
0
25 Apr 2024
AdaQAT: Adaptive Bit-Width Quantization-Aware Training
Cédric Gernigon
Silviu-Ioan Filip
Olivier Sentieys
Clément Coggiola
Mickael Bruno
43
2
0
22 Apr 2024
QCore: Data-Efficient, On-Device Continual Calibration for Quantized Models -- Extended Version
David Campos
Bin Yang
Tung Kieu
Miao Zhang
Chenjuan Guo
Christian S. Jensen
85
8
0
22 Apr 2024
Data-independent Module-aware Pruning for Hierarchical Vision Transformers
Yang He
Qiufeng Wang
ViT
85
5
0
21 Apr 2024
Parallel Decoding via Hidden Transfer for Lossless Large Language Model Acceleration
Pengfei Wu
Jiahao Liu
Zhuocheng Gong
Qifan Wang
Jinpeng Li
Jingang Wang
Xunliang Cai
Dongyan Zhao
72
3
0
18 Apr 2024
SNP: Structured Neuron-level Pruning to Preserve Attention Scores
Kyunghwan Shim
Jaewoong Yun
Shinkook Choi
44
1
0
18 Apr 2024
Energy-Efficient Uncertainty-Aware Biomass Composition Prediction at the Edge
Muhammad Zawish
Paul Albert
Flavio Esposito
Steven Davy
Lizy Abraham
70
0
0
17 Apr 2024
Efficient and accurate neural field reconstruction using resistive memory
Yifei Yu
Shaocong Wang
Woyu Zhang
Xinyuan Zhang
Xiuzhe Wu
...
Zhongrui Wang
Dashan Shang
Qi Liu
Kwang-Ting Cheng
Ming-Yuan Liu
75
0
0
15 Apr 2024