1510.00149
Cited By
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
1 October 2015
Song Han
Huizi Mao
W. Dally
3DGS
Papers citing
"Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"
50 / 3,481 papers shown
Title
Hybrid FedGraph: An efficient hybrid federated learning algorithm using graph convolutional neural network
Jaeyeon Jang
Diego Klabjan
Veena Mendiratta
Fanfei Meng
FedML
62
1
0
15 Apr 2024
SQUAT: Stateful Quantization-Aware Training in Recurrent Spiking Neural Networks
Sreyes P. Venkatesh
Razvan Marinescu
Jason K. Eshraghian
MQ
117
5
0
15 Apr 2024
Bullion: A Column Store for Machine Learning
Gang Liao
Ye Liu
Jianjun Chen
Daniel J. Abadi
77
5
0
13 Apr 2024
Transferable and Principled Efficiency for Open-Vocabulary Segmentation
Jingxuan Xu
Wuyang Chen
Yao-Min Zhao
Yunchao Wei
VLM
107
2
0
11 Apr 2024
Lossless Acceleration of Large Language Model via Adaptive N-gram Parallel Decoding
Jie Ou
Yueming Chen
Wenhong Tian
125
17
0
10 Apr 2024
CQIL: Inference Latency Optimization with Concurrent Computation of Quasi-Independent Layers
Longwei Zou
Qingyang Wang
Han Zhao
Tingfeng Liu
Yi Yang
Yangdong Deng
107
0
0
10 Apr 2024
TabConv: Low-Computation CNN Inference via Table Lookups
Neelesh Gupta
Narayanan Kannan
Pengmiao Zhang
Viktor Prasanna
77
2
0
08 Apr 2024
MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning
Matteo Farina
Massimiliano Mancini
Elia Cunegatti
Gaowen Liu
Giovanni Iacca
Elisa Ricci
VLM
84
2
0
08 Apr 2024
Lightweight Deep Learning for Resource-Constrained Environments: A Survey
Hou-I Liu
Marco Galindo
Hongxia Xie
Lai-Kuan Wong
Hong-Han Shuai
Yung-Hui Li
Wen-Huang Cheng
138
67
0
08 Apr 2024
What Happens When Small Is Made Smaller? Exploring the Impact of Compression on Small Data Pretrained Language Models
Busayo Awobade
Mardiyyah Oduwole
Steven Kolawole
75
1
0
06 Apr 2024
Dynamic Switch Layers For Unsupervised Learning
Haiguang Li
Usama Pervaiz
Michal Matuszak
Robert Kamara
Gilles Roux
T. Thormundsson
Joseph Antognini
134
1
0
05 Apr 2024
Lossless and Near-Lossless Compression for Foundation Models
Moshik Hershcovitch
Leshem Choshen
Andrew Wood
Ilias Enmouri
Peter Chin
S. Sundararaman
Danny Harnik
92
6
0
05 Apr 2024
On the Surprising Efficacy of Distillation as an Alternative to Pre-Training Small Models
Sean Farhat
Deming Chen
118
0
0
04 Apr 2024
Talaria: Interactively Optimizing Machine Learning Models for Efficient Inference
Fred Hohman
Chaoqun Wang
Jinmook Lee
Jochen Görtler
Dominik Moritz
Jeffrey P. Bigham
Zhile Ren
Cecile Foret
Qi Shan
Xiaoyi Zhang
116
7
0
03 Apr 2024
Optimizing the Deployment of Tiny Transformers on Low-Power MCUs
Victor J. B. Jung
Luca Bompani
Moritz Scherer
Francesco Conti
Luca Benini
117
5
0
03 Apr 2024
Rethinking Pruning for Vision-Language Models: Strategies for Effective Sparsity and Performance Restoration
Shwai He
Ang Li
Tianlong Chen
VLM
112
1
0
03 Apr 2024
Improve Knowledge Distillation via Label Revision and Data Selection
Weichao Lan
Yiu-ming Cheung
Qing Xu
Buhua Liu
Zhikai Hu
Mengke Li
Zhenghua Chen
74
3
0
03 Apr 2024
Accelerating Transformer Pre-training with 2:4 Sparsity
Yuezhou Hu
Kang Zhao
Weiyu Huang
Jianfei Chen
Jun Zhu
138
9
0
02 Apr 2024
Condition-Aware Neural Network for Controlled Image Generation
Han Cai
Zhekai Zhang
Zhuoyang Zhang
Qinsheng Zhang
Ming-Yu Liu
Song Han
DiffM
58
8
0
01 Apr 2024
Separate, Dynamic and Differentiable (SMART) Pruner for Block/Output Channel Pruning on Computer Vision Tasks
Guanhua Ding
Zexi Ye
Zhen Zhong
Gang Li
David Shao
63
0
0
29 Mar 2024
Tiny Machine Learning: Progress and Futures
Ji Lin
Ligeng Zhu
Wei-Ming Chen
Wei-Chen Wang
Song Han
90
60
0
28 Mar 2024
Dense Vision Transformer Compression with Few Samples
Hanxiao Zhang
Yifan Zhou
Guo-Hua Wang
Jianxin Wu
ViT
VLM
86
5
0
27 Mar 2024
Block Selective Reprogramming for On-device Training of Vision Transformers
Sreetama Sarkar
Souvik Kundu
Kai Zheng
Peter A. Beerel
63
2
0
25 Mar 2024
Parametric PDE Control with Deep Reinforcement Learning and Differentiable L0-Sparse Polynomial Policies
N. Botteghi
Urban Fasel
AI4CE
108
6
0
22 Mar 2024
Hierarchical Skip Decoding for Efficient Autoregressive Text Generation
Yunqi Zhu
Xuebing Yang
Yuanyuan Wu
Wensheng Zhang
124
3
0
22 Mar 2024
FedMef: Towards Memory-efficient Federated Dynamic Pruning
Hong Huang
Weiming Zhuang
Chen Chen
Lingjuan Lyu
115
11
0
21 Mar 2024
Auto-Train-Once: Controller Network Guided Automatic Network Pruning from Scratch
Xidong Wu
Shangqian Gao
Zeyu Zhang
Zhenzhen Li
Runxue Bao
Yanfu Zhang
Xiaoqian Wang
Heng-Chiao Huang
68
11
0
21 Mar 2024
Evaluating Unsupervised Dimensionality Reduction Methods for Pretrained Sentence Embeddings
Gaifan Zhang
Yi Zhou
Danushka Bollegala
61
4
0
20 Mar 2024
Pruning for Improved ADC Efficiency in Crossbar-based Analog In-memory Accelerators
Timur Ibrayev
Isha Garg
I. Chakraborty
Kaushik Roy
46
0
0
19 Mar 2024
SEVEN: Pruning Transformer Model by Reserving Sentinels
Jinying Xiao
Ping Li
Jie Nie
Zhe Tang
74
3
0
19 Mar 2024
EffiPerception: an Efficient Framework for Various Perception Tasks
Xinhao Xiang
Simon Dräger
Jiawei Zhang
VLM
79
0
0
18 Mar 2024
Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression
Junyuan Hong
Jinhao Duan
Chenhui Zhang
Zhangheng Li
Chulin Xie
...
B. Kailkhura
Dan Hendrycks
Dawn Song
Zhangyang Wang
Yue Liu
112
28
0
18 Mar 2024
Federated Learning based on Pruning and Recovery
Chengjie Ma
FedML
28
0
0
16 Mar 2024
BRIEDGE: EEG-Adaptive Edge AI for Multi-Brain to Multi-Robot Interaction
Jinhui Ouyang
Mingzhu Wu
Xinglin Li
Hanhui Deng
Di Wu
60
2
0
14 Mar 2024
Physics-Inspired Deep Learning Anti-Aliasing Framework in Efficient Channel State Feedback
Yu-Chien Lin
Yan Xin
Ta-Sung Lee
Charlie Zhang
Zhi Ding
52
1
0
12 Mar 2024
IM-Unpack: Training and Inference with Arbitrarily Low Precision Integers
Zhanpeng Zeng
Karthikeyan Sankaralingam
Vikas Singh
100
1
0
12 Mar 2024
Smart-Infinity: Fast Large Language Model Training using Near-Storage Processing on a Real System
Hongsun Jang
Jaeyong Song
Jaewon Jung
Jaeyoung Park
Youngsok Kim
Jinho Lee
54
16
0
11 Mar 2024
A Converting Autoencoder Toward Low-latency and Energy-efficient DNN Inference at the Edge
Hasanul Mahmud
Peng Kang
Kevin Desai
P. Lama
Sushil Prasad
94
3
0
11 Mar 2024
Enhanced Sparsification via Stimulative Training
Shengji Tang
Weihao Lin
Hancheng Ye
Peng Ye
Chong Yu
Baopu Li
Tao Chen
65
2
0
11 Mar 2024
Exploring Hardware Friendly Bottleneck Architecture in CNN for Embedded Computing Systems
Xing Lei
Longjun Liu
Zhiheng Zhou
Hongbin Sun
Nanning Zheng
99
0
0
11 Mar 2024
FrameQuant: Flexible Low-Bit Quantization for Transformers
Harshavardhan Adepu
Zhanpeng Zeng
Li Zhang
Vikas Singh
MQ
60
8
0
10 Mar 2024
A Survey of Lottery Ticket Hypothesis
Bohan Liu
Zijie Zhang
Peixiong He
Zhensen Wang
Yang Xiao
Ruimeng Ye
Yang Zhou
Wei-Shinn Ku
Bo Hui
UQCV
93
15
0
07 Mar 2024
LORS: Low-rank Residual Structure for Parameter-Efficient Network Stacking
Jialin Li
Qiang Nie
Weifu Fu
Yuhuan Lin
Guangpin Tao
Yong-Jin Liu
Chengjie Wang
96
5
0
07 Mar 2024
Learn to Code Sustainably: An Empirical Study on LLM-based Green Code Generation
Tina Vartziotis
Ippolyti Dellatolas
George Dasoulas
Maximilian Schmidt
Florian Schneider
Tim Hoffmann
S. Kotsopoulos
Michael Keckeisen
136
7
0
05 Mar 2024
MADTP: Multimodal Alignment-Guided Dynamic Token Pruning for Accelerating Vision-Language Transformer
Jianjian Cao
Peng Ye
Shengze Li
Chong Yu
Yansong Tang
Jiwen Lu
Tao Chen
88
22
0
05 Mar 2024
On the Compressibility of Quantized Large Language Models
Yu Mao
Weilan Wang
Hongchao Du
Nan Guan
Chun Jason Xue
MQ
80
6
0
03 Mar 2024
OSSCAR: One-Shot Structured Pruning in Vision and Language Models with Combinatorial Optimization
Xiang Meng
Shibal Ibrahim
Kayhan Behdin
Hussein Hazimeh
Natalia Ponomareva
Rahul Mazumder
VLM
104
8
0
02 Mar 2024
BasedAI: A decentralized P2P network for Zero Knowledge Large Language Models (ZK-LLMs)
Sean Wellington
38
5
0
01 Mar 2024
"Lossless" Compression of Deep Neural Networks: A High-dimensional Neural Tangent Kernel Approach
Lingyu Gu
Yongqiang Du
Yuan Zhang
Di Xie
Shiliang Pu
Robert C. Qiu
Zhenyu Liao
98
7
0
01 Mar 2024
T3DNet: Compressing Point Cloud Models for Lightweight 3D Recognition
Zhiyuan Yang
Yunjiao Zhou
Lihua Xie
Jianfei Yang
3DPC
104
1
0
29 Feb 2024