Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

1 October 2015
Song Han
Huizi Mao
W. Dally
    3DGS

Papers citing "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"

50 / 3,481 papers shown
Progressive Binarization with Semi-Structured Pruning for LLMs
Xinyu Yan
Tianao Zhang
Zhiteng Li
Yulun Zhang
MQ
153
1
0
01 Jul 2025
Revisiting LoRA through the Lens of Parameter Redundancy: Spectral Encoding Helps
Jiashun Cheng
Aochuan Chen
Nuo Chen
Ziqi Gao
Yuhan Li
Jia Li
Fugee Tsung
27
0
0
20 Jun 2025
SparseLoRA: Accelerating LLM Fine-Tuning with Contextual Sparsity
Samir Khaki
Xiuyu Li
Junxian Guo
Ligeng Zhu
Chenfeng Xu
Konstantinos N. Plataniotis
Amir Yazdanbakhsh
Kurt Keutzer
Song Han
Zhijian Liu
34
0
0
19 Jun 2025
Efficient and Privacy-Preserving Soft Prompt Transfer for LLMs
Xun Wang
Jing Xu
Franziska Boenisch
Michael Backes
Christopher A. Choquette-Choo
Adam Dziedzic
AAML
44
0
0
19 Jun 2025
A Real-time Endoscopic Image Denoising System
Yu Xing
Shishi Huang
Meng Lv
Guo Chen
Huailiang Wang
Lingzhi Sui
15
0
0
18 Jun 2025
A Survey on World Models Grounded in Acoustic Physical Information
Xiaoliang Chen
Le Chang
Xin Yu
Yunhe Huang
Xianling Tu
SyDa AI4CE
58
0
0
16 Jun 2025
MaskPro: Linear-Space Probabilistic Learning for Strict (N:M)-Sparsity on Large Language Models
Yan Sun
Qixin Zhang
Zhiyuan Yu
Xikun Zhang
Li Shen
Dacheng Tao
38
0
0
15 Jun 2025
ReFrame: Layer Caching for Accelerated Inference in Real-Time Rendering
Lufei Liu
Tor M. Aamodt
25
0
0
14 Jun 2025
Compression Aware Certified Training
Changming Xu
Gagandeep Singh
27
0
0
13 Jun 2025
Auto-Compressing Networks
Vaggelis Dorovatas
Georgios Paraskevopoulos
Alexandros Potamianos
72
0
0
11 Jun 2025
SLED: A Speculative LLM Decoding Framework for Efficient Edge Serving
Xiangchen Li
Dimitrios Spatharakis
Saeid Ghafouri
Jiakun Fan
Dimitrios Nikolopoulos
Deepu John
Bo Ji
Dimitrios S. Nikolopoulos
56
0
0
11 Jun 2025
A Topological Improvement of the Overall Performance of Sparse Evolutionary Training: Motif-Based Structural Optimization of Sparse MLPs Project
Xiaotian Chen
Hongyun Liu
Seyed Sahand Mohammadi Ziabari
37
0
0
10 Jun 2025
Hyperpruning: Efficient Search through Pruned Variants of Recurrent Neural Networks Leveraging Lyapunov Spectrum
Caleb Zheng
Eli Shlizerman
26
0
0
09 Jun 2025
Modified K-means Algorithm with Local Optimality Guarantees
Mingyi Li
Michael R. Metel
Akiko Takeda
DRL
35
0
0
08 Jun 2025
Event Classification of Accelerometer Data for Industrial Package Monitoring with Embedded Deep Learning
Manon Renault
Hamoud Younes
Hugo Tessier
Ronan Le Roy
Bastien Pasdeloup
Mathieu Léonardon
38
0
0
05 Jun 2025
FPGA-Enabled Machine Learning Applications in Earth Observation: A Systematic Review
Cédric Léonard
Dirk Stober
Martin Schulz
107
0
0
04 Jun 2025
Leveraging Coordinate Momentum in SignSGD and Muon: Memory-Optimized Zero-Order
Egor Petrov
Grigoriy Evseev
Aleksey Antonov
Andrey Veprikov
Pavel Plyusnin
Nikolay Bushkov
Stanislav Moiseev
Aleksandr Beznosikov
81
0
0
04 Jun 2025
FroM: Frobenius Norm-Based Data-Free Adaptive Model Merging
Zijian Li
Xiaocheng Feng
Huixin Liu
Yichong Huang
Ting Liu
Bing Qin
MoMe
75
0
0
03 Jun 2025
MUC-G4: Minimal Unsat Core-Guided Incremental Verification for Deep Neural Network Compression
Jingyang Li
Guoqiang Li
29
0
0
03 Jun 2025
The Promise of Spiking Neural Networks for Ubiquitous Computing: A Survey and New Perspectives
Hemanth Sabbella
Archit Mukherjee
Thivya Kandappu
Sounak Dey
Arpan Pal
Archan Misra
Dong Ma
AI4TS
74
0
0
02 Jun 2025
Energy Considerations for Large Pretrained Neural Networks
Leo Mei
Mark Stamp
AI4CE
53
0
0
02 Jun 2025
TAH-QUANT: Effective Activation Quantization in Pipeline Parallelism over Slow Network
Guangxin He
Yuan Cao
Yutong He
Tianyi Bai
Kun Yuan
Binhang Yuan
MQ
61
0
0
02 Jun 2025
VUSA: Virtually Upscaled Systolic Array Architecture to Exploit Unstructured Sparsity in AI Acceleration
Shereef Helal
Alberto García-Ortiz
Lennart Bamberg
48
0
0
01 Jun 2025
Assortment of Attention Heads: Accelerating Federated PEFT with Head Pruning and Strategic Client Selection
Yeshwanth Venkatesha
Souvik Kundu
Priyadarshini Panda
33
1
0
31 May 2025
LittleBit: Ultra Low-Bit Quantization via Latent Factorization
Banseok Lee
Dongkyu Kim
Youngcheon You
Youngmin Kim
MQ
33
0
0
30 May 2025
Sketch Down the FLOPs: Towards Efficient Networks for Human Sketch
Aneeshan Sain
Subhajit Maity
Pinaki Nath Chowdhury
Subhadeep Koley
A. Bhunia
Yi-Zhe Song
3DH
78
0
0
29 May 2025
DenoiseRotator: Enhance Pruning Robustness for LLMs via Importance Concentration
Tianteng Gu
Bei Liu
Bo Xiao
Ke Zeng
Jiacheng Liu
Y. Qian
54
0
0
29 May 2025
Compressing Sine-Activated Low-Rank Adapters through Post-Training Quantization
Cameron Gordon
Yiping Ji
Hemanth Saratchandran
Paul Albert
Simon Lucey
MQ
65
0
0
28 May 2025
ACE: Exploring Activation Cosine Similarity and Variance for Accurate and Calibration-Efficient LLM Pruning
Zhendong Mi
Zhenglun Kong
Geng Yuan
Shaoyi Huang
56
0
0
28 May 2025
Sparsified State-Space Models are Efficient Highway Networks
Woomin Song
Jihoon Tack
Sangwoo Mo
Seunghyuk Oh
Jinwoo Shin
Mamba
41
0
0
27 May 2025
Multimodal Federated Learning: A Survey through the Lens of Different FL Paradigms
Yuanzhe Peng
Jieming Bian
Lei Wang
Yin Huang
Jie Xu
26
0
0
27 May 2025
WINA: Weight Informed Neuron Activation for Accelerating Large Language Model Inference
Sihan Chen
Dan Zhao
Jongwoo Ko
Colby R. Banbury
Huiping Zhuang
Luming Liang
Tianyi Chen
46
0
0
26 May 2025
GraSS: Scalable Influence Function with Sparse Gradient Compression
Pingbang Hu
Joseph Melkonian
Weijing Tang
Han Zhao
Jiaqi W. Ma
TDI
283
0
0
25 May 2025
Meta Pruning via Graph Metanetworks: A Meta Learning Framework for Network Pruning
Yewei Liu
Xiyuan Wang
Muhan Zhang
DD GNN
65
0
0
24 May 2025
$μ$-MoE: Test-Time Pruning as Micro-Grained Mixture-of-Experts
T. Koike-Akino
Jing Liu
Ye Wang
MoE
38
0
0
24 May 2025
Feature Preserving Shrinkage on Bayesian Neural Networks via the R2D2 Prior
Tsai Hor Chan
Dora Yan Zhang
Guosheng Yin
Lequan Yu
UQCV BDL
35
0
0
23 May 2025
Evolving Machine Learning: A Survey
Ignacio Cabrera Martin
Subhaditya Mukherjee
Almas Baimagambetov
Joaquin Vanschoren
Nikolaos Polatidis
VLM
261
0
0
23 May 2025
NQKV: A KV Cache Quantization Scheme Based on Normal Distribution Characteristics
Zhihang Cai
Xingjun Zhang
Zhendong Tan
Zheng Wei
MQ
207
0
0
22 May 2025
Is Quantum Optimization Ready? An Effort Towards Neural Network Compression using Adiabatic Quantum Computing
Zhehui Wang
Benjamin Chen Ming Choong
Tian Huang
Daniel Gerlinghoff
Rick Siow Mong Goh
Cheng Liu
Tao Luo
34
0
0
22 May 2025
Extending Dataset Pruning to Object Detection: A Variance-based Approach
Ryota Yagi
VLM
60
0
0
22 May 2025
DeFTX: Denoised Sparse Fine-Tuning for Zero-Shot Cross-Lingual Transfer
Sona Elza Simon
Preethi Jyothi
VLM
79
0
0
21 May 2025
Refining Neural Activation Patterns for Layer-Level Concept Discovery in Neural Network-Based Receivers
Marko Tuononen
Duy Vu
Dani Korpi
Vesa Starck
Ville Hautamäki
159
0
0
21 May 2025
QUADS: QUAntized Distillation Framework for Efficient Speech Language Understanding
Subrata Biswas
Mohammad Nur Hossain Khan
Bashima Islam
56
0
0
19 May 2025
Automatic Complementary Separation Pruning Toward Lightweight CNNs
David Levin
Gonen Singer
68
0
0
19 May 2025
HarmonE: A Self-Adaptive Approach to Architecting Sustainable MLOps
Hiya Bhatt
Shaunak Biswas
Srinivasan Rakhunathan
Karthik Vaidhyanathan
AI4CE
51
0
0
19 May 2025
Optimal Client Sampling in Federated Learning with Client-Level Heterogeneous Differential Privacy
Jiahao Xu
Rui Hu
Olivera Kotevska
FedML
69
0
0
19 May 2025
An Overview of Arithmetic Adaptations for Inference of Convolutional Neural Networks on Re-configurable Hardware
Ilkay Wunderlich
Benjamin Koch
Sven Schönfeld
232
2
0
19 May 2025
InfiJanice: Joint Analysis and In-situ Correction Engine for Quantization-Induced Math Degradation in Large Language Models
Zhen Li
Yupeng Su
Songmiao Wang
Runming Yang
C. Xie
...
Ming Li
Jiannong Cao
Yuan Xie
Ngai Wong
Hongxia Yang
MQ
122
0
0
16 May 2025
Resource-Efficient Language Models: Quantization for Fast and Accessible Inference
Tollef Emil Jørgensen
MQ
103
0
0
13 May 2025
Efficient Unstructured Pruning of Mamba State-Space Models for Resource-Constrained Environments
Ibne Farabi Shihab
Sanjeda Akter
Anuj Sharma
Mamba
142
1
0
13 May 2025