ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1510.00149
  4. Cited By
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained
  Quantization and Huffman Coding
v1v2v3v4v5 (latest)

Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

1 October 2015
Song Han
Huizi Mao
W. Dally
    3DGS
ArXiv (abs)PDFHTML

Papers citing "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"

50 / 3,481 papers shown
Title
Sample-aware Adaptive Structured Pruning for Large Language Models
Jun Kong
Xinge Ma
Jin Wang
Xuejie Zhang
92
0
0
08 Mar 2025
Wanda++: Pruning Large Language Models via Regional Gradients
Wanda++: Pruning Large Language Models via Regional Gradients
Yifan Yang
Kai Zhen
Bhavana Ganesh
Aram Galstyan
Goeric Huybrechts
...
S. Bodapati
Nathan Susanj
Zheng Zhang
Jack FitzGerald
Abhishek Kumar
231
3
0
06 Mar 2025
Personalized Federated Fine-tuning for Heterogeneous Data: An Automatic Rank Learning Approach via Two-Level LoRA
Jie Hao
Yuman Wu
Ali Payani
Myungjin Lee
Mingrui Liu
126
2
0
05 Mar 2025
FairSense-AI: Responsible AI Meets Sustainability
Shaina Raza
Mukund Sayeeganesh Chettiar
Matin Yousefabadi
Tahniat Khan
Marcelo Lotif
84
0
0
04 Mar 2025
Mamba base PKD for efficient knowledge compression
José Medina
Amnir Hadachi
Paul Honeine
Abdelaziz Bensrhair
Mamba
76
0
0
03 Mar 2025
Privacy-preserving Machine Learning in Internet of Vehicle Applications: Fundamentals, Recent Advances, and Future Direction
Privacy-preserving Machine Learning in Internet of Vehicle Applications: Fundamentals, Recent Advances, and Future Direction
Nazmul Islam
Mohammad Zulkernine
82
0
0
03 Mar 2025
Eau De $Q$-Network: Adaptive Distillation of Neural Networks in Deep Reinforcement Learning
Eau De QQQ-Network: Adaptive Distillation of Neural Networks in Deep Reinforcement Learning
Théo Vincent
Tim Lukas Faust
Yogesh Tripathi
Jan Peters
Carlo DÉramo
78
0
0
03 Mar 2025
RSQ: Learning from Important Tokens Leads to Better Quantized LLMs
Yi-Lin Sung
Prateek Yadav
Jialu Li
Jaehong Yoon
Joey Tianyi Zhou
MQ
103
1
0
03 Mar 2025
Split Adaptation for Pre-trained Vision Transformers
Lixu Wang
Bingqi Shang
Yuchen Li
Payal Mohapatra
Wei Dong
Xiao-Xu Wang
Qi Zhu
ViT
112
1
0
01 Mar 2025
AgroLLM: Connecting Farmers and Agricultural Practices through Large Language Models for Enhanced Knowledge Transfer and Practical Application
Dinesh Jackson Samuel
Inna Skarga-Bandurova
David Sikolia
Muhammad Awais
82
0
0
28 Feb 2025
MobiLLM: Enabling LLM Fine-Tuning on the Mobile Device via Server Assisted Side Tuning
MobiLLM: Enabling LLM Fine-Tuning on the Mobile Device via Server Assisted Side Tuning
Liang Li
Xingke Yang
Wen Wu
Hao Wang
Tomoaki Ohtsuki
Xin Fu
Miao Pan
Xuemin Shen
73
1
0
27 Feb 2025
Binary Neural Networks for Large Language Model: A Survey
Binary Neural Networks for Large Language Model: A Survey
Liangdong Liu
Zhitong Zheng
Cong Wang
TianHuang Su
ZhenYu Yang
MQ
147
0
0
26 Feb 2025
Mixtraining: A Better Trade-Off Between Compute and Performance
Mixtraining: A Better Trade-Off Between Compute and Performance
Zexin Li
Jiancheng Zhang
Yufei Li
Yinglun Zhu
Cong Liu
71
0
0
26 Feb 2025
When Compression Meets Model Compression: Memory-Efficient Double Compression for Large Language Models
When Compression Meets Model Compression: Memory-Efficient Double Compression for Large Language Models
Weilan Wang
Yu Mao
Dongdong Tang
Hongchao Du
Nan Guan
Chun Jason Xue
MQ
126
2
0
24 Feb 2025
Pleno-Generation: A Scalable Generative Face Video Compression Framework with Bandwidth Intelligence
Pleno-Generation: A Scalable Generative Face Video Compression Framework with Bandwidth Intelligence
Bolin Chen
Hanwei Zhu
Shanzhi Yin
Lingyu Zhu
Jie Chen
Ru-Ling Liao
Shiqi Wang
Yan Ye
96
2
0
24 Feb 2025
Machine learning and high dimensional vector search
Matthijs Douze
140
0
0
24 Feb 2025
Optimizing Singular Spectrum for Large Language Model Compression
Dengjie Li
Tiancheng Shen
Yao Zhou
Baisong Yang
Zhongying Liu
Masheng Yang
Guohao Li
Yibo Yang
Yujie Zhong
Ming-Hsuan Yang
88
1
0
24 Feb 2025
"Actionable Help" in Crises: A Novel Dataset and Resource-Efficient Models for Identifying Request and Offer Social Media Posts
"Actionable Help" in Crises: A Novel Dataset and Resource-Efficient Models for Identifying Request and Offer Social Media Posts
Rabindra Lamsal
M. Read
S. Karunasekera
Muhammad Imran
75
0
0
24 Feb 2025
Automatic Joint Structured Pruning and Quantization for Efficient Neural Network Training and Compression
Automatic Joint Structured Pruning and Quantization for Efficient Neural Network Training and Compression
Xiaoyi Qu
David Aponte
Colby R. Banbury
Daniel P. Robinson
Tianyu Ding
K. Koishida
Ilya Zharkov
Tianyi Chen
MQ
110
2
0
23 Feb 2025
Verification of Bit-Flip Attacks against Quantized Neural Networks
Verification of Bit-Flip Attacks against Quantized Neural Networks
Yedi Zhang
Lei Huang
Pengfei Gao
Fu Song
Jun Sun
Jin Song Dong
AAML
106
0
0
22 Feb 2025
FedSpaLLM: Federated Pruning of Large Language Models
FedSpaLLM: Federated Pruning of Large Language Models
Guangji Bai
Yijiang Li
Zilinghan Li
Liang Zhao
Kibaek Kim
FedML
143
6
0
20 Feb 2025
A General Error-Theoretical Analysis Framework for Constructing Compression Strategies
A General Error-Theoretical Analysis Framework for Constructing Compression Strategies
Boyang Zhang
Daning Cheng
Yunquan Zhang
Meiqi Tu
Fangmin Liu
Jiake Tian
80
1
0
19 Feb 2025
GPU Memory Usage Optimization for Backward Propagation in Deep Network Training
GPU Memory Usage Optimization for Backward Propagation in Deep Network Training
Ding-Yong Hong
Tzu-Hsien Tsai
Ning Wang
Pangfeng Liu
Jan-Jan Wu
118
0
0
18 Feb 2025
DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs
Minxuan Lv
Zhenpeng Su
Leiyu Pan
Yizhe Xiong
Zijia Lin
...
Guiguang Ding
Cheng Luo
Di Zhang
Kun Gai
Songlin Hu
MoE
120
0
0
18 Feb 2025
Vision-Language Models for Edge Networks: A Comprehensive Survey
Vision-Language Models for Edge Networks: A Comprehensive Survey
Ahmed Sharshar
Latif U. Khan
Waseem Ullah
Mohsen Guizani
VLM
167
3
0
11 Feb 2025
EfficientLLM: Scalable Pruning-Aware Pretraining for Architecture-Agnostic Edge Language Models
EfficientLLM: Scalable Pruning-Aware Pretraining for Architecture-Agnostic Edge Language Models
Xingrun Xing
Zheng Liu
Shitao Xiao
Boyan Gao
Yiming Liang
Wanpeng Zhang
Haokun Lin
Guoqi Li
Jiajun Zhang
LRM
284
2
0
10 Feb 2025
Kolmogorov-Arnold Fourier Networks
Kolmogorov-Arnold Fourier Networks
Jusheng Zhang
Yijia Fan
Kaitong Cai
Keze Wang
92
0
0
09 Feb 2025
BCQ: Block Clustered Quantization for 4-bit (W4A4) LLM Inference
Reena Elangovan
Charbel Sakr
A. Raghunathan
Brucek Khailany
MQ
123
1
0
07 Feb 2025
Advancing Weight and Channel Sparsification with Enhanced Saliency
Advancing Weight and Channel Sparsification with Enhanced Saliency
Xinglong Sun
Maying Shen
Hongxu Yin
Lei Mao
Pavlo Molchanov
Jose M. Alvarez
94
1
0
05 Feb 2025
Position: AI Scaling: From Up to Down and Out
Position: AI Scaling: From Up to Down and Out
Yunke Wang
Yanxi Li
Chang Xu
HAI
237
1
0
02 Feb 2025
DCentNet: Decentralized Multistage Biomedical Signal Classification using Early Exits
DCentNet: Decentralized Multistage Biomedical Signal Classification using Early Exits
Xiaolin Li
Binhua Huang
B. Cardiff
Deepu John
71
0
0
31 Jan 2025
Brain network science modelling of sparse neural networks enables Transformers and LLMs to perform as fully connected
Brain network science modelling of sparse neural networks enables Transformers and LLMs to perform as fully connected
Yingtao Zhang
Diego Cerretti
Jialin Zhao
Wenjing Wu
Ziheng Liao
Umberto Michieli
C. Cannistraci
165
1
0
31 Jan 2025
SLoPe: Double-Pruned Sparse Plus Lazy Low-Rank Adapter Pretraining of LLMs
SLoPe: Double-Pruned Sparse Plus Lazy Low-Rank Adapter Pretraining of LLMs
Mohammad Mozaffari
Amir Yazdanbakhsh
Zhao Zhang
M. Dehnavi
150
7
0
28 Jan 2025
Ditto: Accelerating Diffusion Model via Temporal Value Similarity
Ditto: Accelerating Diffusion Model via Temporal Value Similarity
Sungbin Kim
Hyunwuk Lee
Wonho Cho
Mincheol Park
Won Woo Ro
160
1
0
20 Jan 2025
MOGNET: A Mux-residual quantized Network leveraging Online-Generated weights
MOGNET: A Mux-residual quantized Network leveraging Online-Generated weights
Van Thien Nguyen
William Guicquero
Gilles Sicard
MQ
163
1
0
17 Jan 2025
Lossless Compression of Vector IDs for Approximate Nearest Neighbor Search
Lossless Compression of Vector IDs for Approximate Nearest Neighbor Search
Daniel de Souza Severo
Giuseppe Ottaviano
Matthew Muckley
Karen Ullrich
Matthijs Douze
MQ
107
1
0
16 Jan 2025
Histogram-Equalized Quantization for logic-gated Residual Neural Networks
Histogram-Equalized Quantization for logic-gated Residual Neural Networks
Van Thien Nguyen
William Guicquero
Gilles Sicard
MQ
130
2
0
10 Jan 2025
Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic
Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic
Yifei He
Yuzheng Hu
Yong Lin
Tong Zhang
Han Zhao
FedMLMoMe
129
25
0
08 Jan 2025
EEG Emotion Copilot: Optimizing Lightweight LLMs for Emotional EEG Interpretation with Assisted Medical Record Generation
EEG Emotion Copilot: Optimizing Lightweight LLMs for Emotional EEG Interpretation with Assisted Medical Record Generation
Hongyu Chen
Weiming Zeng
Chong Chen
Luhui Cai
Fei Wang
...
Wei Zhang
Yuchen Li
Hongjie Yan
W. Siok
Nizhuan Wang
87
0
0
08 Jan 2025
Optimizing Edge AI: A Comprehensive Survey on Data, Model, and System Strategies
Optimizing Edge AI: A Comprehensive Survey on Data, Model, and System Strategies
Xubin Wang
Weijia Jia
181
2
0
08 Jan 2025
PTEENet: Post-Trained Early-Exit Neural Networks Augmentation for Inference Cost Optimization
PTEENet: Post-Trained Early-Exit Neural Networks Augmentation for Inference Cost Optimization
Assaf Lahiany
Yehudit Aperstein
93
4
0
07 Jan 2025
A Novel Structure-Agnostic Multi-Objective Approach for Weight-Sharing Compression in Deep Neural Networks
Rasa Khosrowshahli
Shahryar Rahnamayan
Beatrice Ombuki-Berman
MQ
80
1
0
06 Jan 2025
Quantization Meets Reasoning: Exploring LLM Low-Bit Quantization Degradation for Mathematical Reasoning
Quantization Meets Reasoning: Exploring LLM Low-Bit Quantization Degradation for Mathematical Reasoning
Zhen Li
Yupeng Su
Runming Yang
C. Xie
Zehua Wang
Zhongwei Xie
Ngai Wong
Hongxia Yang
MQLRM
186
4
0
06 Jan 2025
Pruning-based Data Selection and Network Fusion for Efficient Deep Learning
Humaira Kousar
Hasnain Irshad Bhatti
Jaekyun Moon
105
1
0
03 Jan 2025
SlimGPT: Layer-wise Structured Pruning for Large Language Models
SlimGPT: Layer-wise Structured Pruning for Large Language Models
Gui Ling
Ziyang Wang
Yuliang Yan
Qingwen Liu
106
10
0
24 Dec 2024
AutoSculpt: A Pattern-based Model Auto-pruning Framework Using Reinforcement Learning and Graph Learning
AutoSculpt: A Pattern-based Model Auto-pruning Framework Using Reinforcement Learning and Graph Learning
Lixian Jing
Jianpeng Qi
Junyu Dong
Yanwei Yu
3DPCAI4CE
83
0
0
24 Dec 2024
GQSA: Group Quantization and Sparsity for Accelerating Large Language Model Inference
GQSA: Group Quantization and Sparsity for Accelerating Large Language Model Inference
Chao Zeng
Songwei Liu
Shu Yang
Fangmin Chen
Xing Mei
Lean Fu
MQ
135
0
0
23 Dec 2024
Lightweight Design and Optimization methods for DCNNs: Progress and
  Futures
Lightweight Design and Optimization methods for DCNNs: Progress and Futures
Hanhua Long
Wenbin Bi
Jian Sun
112
0
0
22 Dec 2024
Rethinking Model Redundancy for Low-light Image Enhancement
Rethinking Model Redundancy for Low-light Image Enhancement
Tong Li
Lizhi Wang
Hansen Feng
Lin Zhu
Wanxuan Lu
Hua Huang
121
0
0
21 Dec 2024
Holistic Adversarially Robust Pruning
Holistic Adversarially Robust Pruning
Qi Zhao
Christian Wressnegger
133
10
0
19 Dec 2024
Previous
123456...686970
Next