ResearchTrend.AI

arXiv:1510.00149

Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

1 October 2015
Song Han
Huizi Mao
W. Dally
    3DGS
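For readers skimming this citation list, the first two stages named in the paper's title can be sketched on a toy weight matrix. This is a minimal illustrative approximation (magnitude pruning plus a simple k-means weight-sharing codebook in NumPy), not the paper's exact pipeline, which also retrains the pruned network and Huffman-codes the final indices; the sparsity level and codebook size below are arbitrary choices for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(8, 8)).astype(np.float32)  # toy layer weights

# Stage 1: magnitude pruning -- zero out the smallest 70% of weights.
sparsity = 0.7
threshold = np.quantile(np.abs(weights), sparsity)
mask = np.abs(weights) > threshold
pruned = weights * mask

# Stage 2: weight sharing -- cluster surviving weights into a small
# codebook (4 centroids, plain Lloyd's iterations).
survivors = pruned[mask]
centroids = np.linspace(survivors.min(), survivors.max(), 4)
for _ in range(20):
    assign = np.argmin(np.abs(survivors[:, None] - centroids[None, :]), axis=1)
    for k in range(len(centroids)):
        if np.any(assign == k):
            centroids[k] = survivors[assign == k].mean()
assign = np.argmin(np.abs(survivors[:, None] - centroids[None, :]), axis=1)

quantized = pruned.copy()
quantized[mask] = centroids[assign]  # each surviving weight -> nearest centroid

print("nonzero fraction:", mask.mean())
print("distinct nonzero values:", len(np.unique(quantized[quantized != 0])))
```

After these two stages, the matrix is mostly zeros and the nonzeros take at most 4 distinct values, which is what makes the final Huffman-coding stage effective.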

Papers citing "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"

50 of 3,481 citing papers shown
NeuroPrune: A Neuro-inspired Topological Sparse Training Algorithm for Large Language Models
Amit Dhurandhar
Tejaswini Pedapati
Ronny Luss
Soham Dan
Aurélie C. Lozano
Payel Das
Georgios Kollias
94
3
0
28 Feb 2024
SparseLLM: Towards Global Pruning for Pre-trained Language Models
Guangji Bai
Yijiang Li
Chen Ling
Kibaek Kim
Liang Zhao
135
11
0
28 Feb 2024
REPrune: Channel Pruning via Kernel Representative Selection
Mincheol Park
Dongjin Kim
Cheonjun Park
Yuna Park
Gyeong Eun Gong
Won Woo Ro
Suhyun Kim
VLM
73
1
0
27 Feb 2024
GenAINet: Enabling Wireless Collective Intelligence via Knowledge Transfer and Reasoning
Han Zou
Qiyang Zhao
Lina Bariah
Yu Tian
M. Bennis
S. Lasaulce
158
14
0
26 Feb 2024
EncodingNet: A Novel Encoding-based MAC Design for Efficient Neural Network Acceleration
Bo Liu
Grace Li Zhang
Xunzhao Yin
Ulf Schlichtmann
Bing Li
MQAI4CE
77
0
0
25 Feb 2024
Model Compression Method for S4 with Diagonal State Space Layers using Balanced Truncation
Haruka Ezoe
Kazuhiro Sato
61
0
0
25 Feb 2024
Shaving Weights with Occam's Razor: Bayesian Sparsification for Neural Networks Using the Marginal Likelihood
Rayen Dhahri
Alexander Immer
Bertrand Charpentier
Stephan Günnemann
Vincent Fortuin
BDLUQCV
85
5
0
25 Feb 2024
Sparse MeZO: Less Parameters for Better Performance in Zeroth-Order LLM Fine-Tuning
Yong Liu
Zirui Zhu
Chaoyu Gong
Minhao Cheng
Cho-Jui Hsieh
Yang You
MoE
86
23
0
24 Feb 2024
How Do Nonlinear Transformers Learn and Generalize in In-Context Learning?
Hongkang Li
Meng Wang
Songtao Lu
Xiaodong Cui
Pin-Yu Chen
MLT
121
18
0
23 Feb 2024
NeuroFlux: Memory-Efficient CNN Training Using Adaptive Local Learning
Dhananjay Saikumar
Blesson Varghese
72
1
0
21 Feb 2024
ProSparse: Introducing and Enhancing Intrinsic Activation Sparsity within Large Language Models
Chenyang Song
Xu Han
Zhengyan Zhang
Shengding Hu
Xiyu Shi
...
Chen Chen
Zhiyuan Liu
Guanglin Li
Tao Yang
Maosong Sun
162
32
0
21 Feb 2024
Tiny Reinforcement Learning for Quadruped Locomotion using Decision Transformers
Orhan Eren Akgün
Néstor Cuevas
Matheus Farias
Daniel Garces
101
0
0
20 Feb 2024
A Survey on Knowledge Distillation of Large Language Models
Xiaohan Xu
Ming Li
Chongyang Tao
Tao Shen
Reynold Cheng
Jinyang Li
Can Xu
Dacheng Tao
Dinesh Manocha
KELMVLM
175
135
0
20 Feb 2024
In value-based deep reinforcement learning, a pruned network is a good network
J. Obando-Ceron
Rameswar Panda
Pablo Samuel Castro
OffRL
132
26
0
19 Feb 2024
Acquiring Clean Language Models from Backdoor Poisoned Datasets by Downscaling Frequency Space
Zongru Wu
Zhuosheng Zhang
Pengzhou Cheng
Gongshen Liu
AAML
134
6
0
19 Feb 2024
Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding
Zhuoming Chen
Avner May
Ruslan Svirschevski
Yuhsun Huang
Max Ryabinin
Zhihao Jia
Beidi Chen
108
52
0
19 Feb 2024
Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark
Yihua Zhang
Pingzhi Li
Junyuan Hong
Jiaxiang Li
Yimeng Zhang
...
Wotao Yin
Mingyi Hong
Zhangyang Wang
Sijia Liu
Tianlong Chen
135
60
0
18 Feb 2024
Why Lift so Heavy? Slimming Large Language Models by Cutting Off the Layers
Shuzhou Yuan
Ercong Nie
Bolei Ma
Michael Farber
114
3
0
18 Feb 2024
Generalizability of Mixture of Domain-Specific Adapters from the Lens of Signed Weight Directions and its Application to Effective Model Pruning
Tuc Nguyen
Thai Le
MoMe
95
3
0
16 Feb 2024
BitDelta: Your Fine-Tune May Only Be Worth One Bit
James Liu
Guangxuan Xiao
Kai Li
Jason D. Lee
Song Han
Tri Dao
Tianle Cai
75
26
0
15 Feb 2024
HiRE: High Recall Approximate Top-$k$ Estimation for Efficient LLM Inference
Yashas Samaga
Varun Yerram
Chong You
Srinadh Bhojanapalli
Sanjiv Kumar
Prateek Jain
Praneeth Netrapalli
89
5
0
14 Feb 2024
FL-NAS: Towards Fairness of NAS for Resource Constrained Devices via Large Language Models
Ruiyang Qin
Yuting Hu
Zheyu Yan
Jinjun Xiong
Ahmed Abbasi
Yiyu Shi
71
7
0
09 Feb 2024
Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes
Lucio Dery
Steven Kolawole
Jean-Francois Kagey
Virginia Smith
Graham Neubig
Ameet Talwalkar
112
36
0
08 Feb 2024
Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
Boyi Wei
Kaixuan Huang
Yangsibo Huang
Tinghao Xie
Xiangyu Qi
Mengzhou Xia
Prateek Mittal
Mengdi Wang
Peter Henderson
AAML
162
118
0
07 Feb 2024
EfficientViT-SAM: Accelerated Segment Anything Model Without Accuracy Loss
Zhuoyang Zhang
Han Cai
Song Han
VLM
72
3
0
07 Feb 2024
L4Q: Parameter Efficient Quantization-Aware Fine-Tuning on Large Language Models
Hyesung Jeon
Yulhwa Kim
Jae-Joon Kim
MQ
64
5
0
07 Feb 2024
Progressive Gradient Flow for Robust N:M Sparsity Training in Transformers
Abhimanyu Bambhaniya
Amir Yazdanbakhsh
Suvinay Subramanian
Sheng-Chun Kao
Shivani Agrawal
Utku Evci
Tushar Krishna
121
19
0
07 Feb 2024
Compressing Deep Reinforcement Learning Networks with a Dynamic Structured Pruning Method for Autonomous Driving
Wensheng Su
Zhenni Li
Minrui Xu
Jiawen Kang
Dusit Niyato
Shengli Xie
77
9
0
07 Feb 2024
Enhance DNN Adversarial Robustness and Efficiency via Injecting Noise to Non-Essential Neurons
Zhenyu Liu
Garrett Gagnon
Swagath Venkataramani
Liu Liu
AAML
72
0
0
06 Feb 2024
Single-GPU GNN Systems: Traps and Pitfalls
Yidong Gong
A. Tarafder
Saima Afrin
Pradeep Kumar
GNN
100
3
0
05 Feb 2024
A Survey on Transformer Compression
Yehui Tang
Yunhe Wang
Jianyuan Guo
Zhijun Tu
Kai Han
Hailin Hu
Dacheng Tao
158
35
0
05 Feb 2024
Dynamic Sparse Learning: A Novel Paradigm for Efficient Recommendation
Shuyao Wang
Yongduo Sui
Jiancan Wu
Zhi Zheng
Hui Xiong
63
16
0
05 Feb 2024
Ultrafast jet classification on FPGAs for the HL-LHC
Patrick Odagiu
Zhiqiang Que
Javier Mauricio Duarte
J. Haller
Gregor Kasieczka
...
Arpita Seksaria
S. Summers
A. Sznajder
A. Tapper
Thea Klæboe Årrestad
65
3
0
02 Feb 2024
Lightweight Pixel Difference Networks for Efficient Visual Representation Learning
Z. Su
Jiehua Zhang
Longguang Wang
Hua Zhang
Zhen Liu
M. Pietikäinen
Li Liu
92
22
0
01 Feb 2024
EPSD: Early Pruning with Self-Distillation for Efficient Model Compression
Dong Chen
Ning Liu
Yichen Zhu
Zhengping Che
Rui Ma
Fachao Zhang
Xiaofeng Mou
Yi Chang
Jian Tang
67
4
0
31 Jan 2024
Effect of Weight Quantization on Learning Models by Typical Case Analysis
Shuhei Kashiwamura
Ayaka Sakata
Masaaki Imaizumi
MQ
73
1
0
30 Jan 2024
One-Step Forward and Backtrack: Overcoming Zig-Zagging in Loss-Aware Quantization Training
Lianbo Ma
Yuee Zhou
Jianlun Ma
Guo-Ding Yu
Qing Li
MQ
52
2
0
30 Jan 2024
SwapNet: Efficient Swapping for DNN Inference on Edge AI Devices Beyond the Memory Budget
Kun Wang
Jiani Cao
Zimu Zhou
Zhenjiang Li
67
7
0
30 Jan 2024
Security and Privacy Challenges of Large Language Models: A Survey
B. Das
M. H. Amini
Yanzhao Wu
PILMELM
138
145
0
30 Jan 2024
Do deep neural networks utilize the weight space efficiently?
Onur Can Koyun
B. U. Toreyin
59
0
0
26 Jan 2024
SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Saleh Ashkboos
Maximilian L. Croci
Marcelo Gennari do Nascimento
Torsten Hoefler
James Hensman
VLM
220
186
0
26 Jan 2024
MPTQ-ViT: Mixed-Precision Post-Training Quantization for Vision Transformer
Y. Tai
An-Yeu Wu
MQ
108
6
0
26 Jan 2024
Marabou 2.0: A Versatile Formal Analyzer of Neural Networks
Haoze Wu
Omri Isac
Aleksandar Zeljić
Teruhiro Tagomori
M. Daggitt
...
Min Wu
Min Zhang
Ekaterina Komendantskaya
Guy Katz
Clark W. Barrett
140
42
0
25 Jan 2024
Communication-Efficient Federated Learning through Adaptive Weight Clustering and Server-Side Distillation
Vasileios Tsouvalas
Aaqib Saeed
T. Ozcelebi
N. Meratnia
FedML
71
7
0
25 Jan 2024
Dynamic Layer Tying for Parameter-Efficient Transformers
Tamir David Hay
Lior Wolf
75
3
0
23 Jan 2024
APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference
Bowen Zhao
Hannaneh Hajishirzi
Qingqing Cao
130
21
0
22 Jan 2024
Robustness to distribution shifts of compressed networks for edge devices
Lulan Shen
Ali Edalati
Brett H. Meyer
Warren Gross
James J. Clark
84
0
0
22 Jan 2024
Zero-Space Cost Fault Tolerance for Transformer-based Language Models on ReRAM
Bingbing Li
Geng Yuan
Zigeng Wang
Shaoyi Huang
Hongwu Peng
Payman Behnam
Wujie Wen
Hang Liu
Caiwen Ding
61
6
0
22 Jan 2024
OnDev-LCT: On-Device Lightweight Convolutional Transformers towards federated learning
Chu Myaet Thwal
Minh N. H. Nguyen
Ye Lin Tun
Seongjin Kim
My T. Thai
Choong Seon Hong
125
7
0
22 Jan 2024
PRILoRA: Pruned and Rank-Increasing Low-Rank Adaptation
Nadav Benedek
Lior Wolf
87
5
0
20 Jan 2024