Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

1 October 2015
Song Han
Huizi Mao
W. Dally
    3DGS

Papers citing "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"

50 / 3,448 papers shown
Title
Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic
Yifei He
Yuzheng Hu
Yong Lin
Tong Zhang
Han Zhao
FedML
MoMe
70
19
0
08 Jan 2025
EEG Emotion Copilot: Optimizing Lightweight LLMs for Emotional EEG Interpretation with Assisted Medical Record Generation
Hongyu Chen
Weiming Zeng
Cen Chen
Luhui Cai
Fei-Yue Wang
...
Wei Zhang
Yuchen Li
Hongjie Yan
W. Siok
Nizhuan Wang
49
1
0
08 Jan 2025
Optimizing Edge AI: A Comprehensive Survey on Data, Model, and System Strategies
Xubin Wang
Weijia Jia
41
0
0
08 Jan 2025
PTEENet: Post-Trained Early-Exit Neural Networks Augmentation for Inference Cost Optimization
Assaf Lahiany
Yehudit Aperstein
38
4
0
07 Jan 2025
A Novel Structure-Agnostic Multi-Objective Approach for Weight-Sharing Compression in Deep Neural Networks
Rasa Khosrowshahli
Shahryar Rahnamayan
Beatrice Ombuki-Berman
MQ
28
0
0
06 Jan 2025
Quantization Meets Reasoning: Exploring LLM Low-Bit Quantization Degradation for Mathematical Reasoning
Zhen Li
Yupeng Su
Runming Yang
C. Xie
Zehua Wang
Zhongwei Xie
Ngai Wong
Hongxia Yang
MQ
LRM
61
3
0
06 Jan 2025
Pruning-based Data Selection and Network Fusion for Efficient Deep Learning
Humaira Kousar
Hasnain Irshad Bhatti
Jaekyun Moon
44
0
0
03 Jan 2025
SlimGPT: Layer-wise Structured Pruning for Large Language Models
Gui Ling
Ziyang Wang
Yuliang Yan
Qingwen Liu
38
2
0
24 Dec 2024
AutoSculpt: A Pattern-based Model Auto-pruning Framework Using Reinforcement Learning and Graph Learning
Lixian Jing
Jianpeng Qi
Junyu Dong
Yanwei Yu
3DPC
AI4CE
54
0
0
24 Dec 2024
GQSA: Group Quantization and Sparsity for Accelerating Large Language Model Inference
Chao Zeng
Songwei Liu
Shu Yang
Fangmin Chen
Xing Mei
Lean Fu
MQ
54
0
0
23 Dec 2024
Lightweight Design and Optimization methods for DCNNs: Progress and Futures
Hanhua Long
Wenbin Bi
Jian Sun
92
0
0
22 Dec 2024
Rethinking Model Redundancy for Low-light Image Enhancement
Tong Li
Lizhi Wang
Hansen Feng
Lin Zhu
Wanxuan Lu
Hua Huang
89
0
0
21 Dec 2024
Holistic Adversarially Robust Pruning
Qi Zhao
Christian Wressnegger
95
9
0
19 Dec 2024
RemoteTrimmer: Adaptive Structural Pruning for Remote Sensing Image Classification
Guanwenjie Zou
Liang Yao
F. Liu
Chuanyi Zhang
Xin Li
Ning Chen
Shengxiang Xu
Jun Zhou
74
1
0
17 Dec 2024
Priority-Aware Model-Distributed Inference at Edge Networks
Teng Li
Hulya Seferoglu
81
0
0
16 Dec 2024
Designing Semi-Structured Pruning of Graph Convolutional Networks for Skeleton-based Recognition
Hichem Sahbi
CVBM
79
0
0
16 Dec 2024
MOFHEI: Model Optimizing Framework for Fast and Efficient Homomorphically Encrypted Neural Network Inference
Parsa Ghazvinian
Robert Podschwadt
Prajwal Panzade
Mohammad H. Rafiei
Daniel Takabi
77
0
0
10 Dec 2024
TT-MPD: Test Time Model Pruning and Distillation
Haihang Wu
Wei Wang
T. Malepathirana
Sachith Seneviratne
D. Oetomo
Saman K. Halgamuge
76
0
0
10 Dec 2024
DEX: Data Channel Extension for Efficient CNN Inference on Tiny AI Accelerators
Taesik Gong
F. Kawsar
Chulhong Min
80
3
0
09 Dec 2024
MultiTASC++: A Continuously Adaptive Scheduler for Edge-Based Multi-Device Cascade Inference
Sokratis Nikolaidis
Stylianos I. Venieris
I. Venieris
91
0
0
05 Dec 2024
Quantized and Interpretable Learning Scheme for Deep Neural Networks in Classification Task
Alireza Maleki
Mahsa Lavaei
Mohsen Bagheritabar
Salar Beigzad
Zahra Abadi
MQ
77
0
0
05 Dec 2024
CPTQuant -- A Novel Mixed Precision Post-Training Quantization Techniques for Large Language Models
Amitash Nanda
Sree Bhargavi Balija
D. Sahoo
MQ
74
0
0
03 Dec 2024
AdaScale: Dynamic Context-aware DNN Scaling via Automated Adaptation Loop on Mobile Devices
Yuzhan Wang
Sicong Liu
Bin Guo
Boqi Zhang
Ke Ma
Yasan Ding
Hao Luo
Yao Li
Zhiwen Yu
87
1
0
01 Dec 2024
Is Oracle Pruning the True Oracle?
Sicheng Feng
Keda Tao
Haoyu Wang
VLM
75
0
0
28 Nov 2024
TinyViM: Frequency Decoupling for Tiny Hybrid Vision Mamba
Xiaowen Ma
Zhenliang Ni
Xinghao Chen
Mamba
93
2
0
26 Nov 2024
DRPruning: Efficient Large Language Model Pruning through Distributionally Robust Optimization
Hexuan Deng
Wenxiang Jiao
Xuebo Liu
Min Zhang
Zhaopeng Tu
VLM
85
0
0
21 Nov 2024
Pushing the Limits of Sparsity: A Bag of Tricks for Extreme Pruning
Andy Li
A. Durrant
Milan Markovic
Lu Yin
Georgios Leontidis
Tianlong Chen
82
0
0
20 Nov 2024
SoftLMs: Efficient Adaptive Low-Rank Approximation of Language Models using Soft-Thresholding Mechanism
Priyansh Bhatnagar
Linfeng Wen
Mingu Kang
39
0
0
15 Nov 2024
P$^2$ Law: Scaling Law for Post-Training After Model Pruning
Xiaodong Chen
Yuxuan Hu
Jing Zhang
Xiaokang Zhang
C. Li
Hongyu Chen
34
0
0
15 Nov 2024
NVCiM-PT: An NVCiM-assisted Prompt Tuning Framework for Edge LLMs
Ruiyang Qin
Pengyu Ren
Zheyu Yan
Liu Liu
Dancheng Liu
Amir Nassereldine
Jinjun Xiong
Kai Ni
Sharon Hu
Yiyu Shi
VLM
75
1
0
12 Nov 2024
Optimizing Traffic Signal Control using High-Dimensional State Representation and Efficient Deep Reinforcement Learning
Lawrence Francis
Blessed Guda
Ahmed Biyabani
27
0
0
12 Nov 2024
CULL-MT: Compression Using Language and Layer pruning for Machine Translation
Pedram Rostami
M. Dousti
39
0
0
10 Nov 2024
Client Contribution Normalization for Enhanced Federated Learning
Mayank Kumar Kundalwal
Anurag Saraswat
Ishan Mishra
Deepak Mishra
FedML
41
0
0
10 Nov 2024
Learning Morphisms with Gauss-Newton Approximation for Growing Networks
Neal Lawton
Aram Galstyan
Greg Ver Steeg
26
0
0
07 Nov 2024
Flashy Backdoor: Real-world Environment Backdoor Attack on SNNs with DVS Cameras
Roberto Riaño
Gorka Abad
S. Picek
A. Urbieta
AAML
41
0
0
05 Nov 2024
Magnitude Pruning of Large Pretrained Transformer Models with a Mixture Gaussian Prior
Mingxuan Zhang
Y. Sun
F. Liang
41
0
0
01 Nov 2024
On the Impact of White-box Deployment Strategies for Edge AI on Latency and Model Performance
Jaskirat Singh
Bram Adams
Ahmed E. Hassan
VLM
47
0
0
01 Nov 2024
Mutual Information Preserving Neural Network Pruning
Charles Westphal
Stephen Hailes
Mirco Musolesi
59
1
0
31 Oct 2024
Offline Behavior Distillation
Shiye Lei
Sen Zhang
Dacheng Tao
OffRL
46
0
0
30 Oct 2024
Efficient Reprogramming of Memristive Crossbars for DNNs: Weight Sorting and Bit Stucking
Matheus Farias
H. T. Kung
MQ
30
0
0
29 Oct 2024
Data Generation for Hardware-Friendly Post-Training Quantization
Lior Dikstein
Ariel Lapid
Arnon Netzer
H. Habi
MQ
262
0
0
29 Oct 2024
MultiTok: Variable-Length Tokenization for Efficient LLMs Adapted from LZW Compression
Noel Elias
H. Esfahanizadeh
Kaan Kale
S. Vishwanath
Muriel Médard
38
0
0
28 Oct 2024
BLAST: Block-Level Adaptive Structured Matrices for Efficient Deep Neural Network Inference
Changwoo Lee
Soo Min Kwon
Qing Qu
Hun-Seok Kim
36
0
0
28 Oct 2024
Deep Insights into Automated Optimization with Large Language Models and Evolutionary Algorithms
He Yu
Jiaheng Liu
59
2
0
28 Oct 2024
Meta-Learning for Speeding Up Large Model Inference in Decentralized Environments
Yuzhe Yang
Yipeng Du
Ahmad Farhan
Claudio Angione
Yue Zhao
Harry Yang
Fielding Johnston
James Buban
Patrick Colangelo
39
0
0
28 Oct 2024
NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks
Yongchang Hao
Yanshuai Cao
Lili Mou
MQ
36
2
0
28 Oct 2024
Ripple: Accelerating LLM Inference on Smartphones with Correlation-Aware Neuron Management
Tuowei Wang
Ruwen Fan
Minxing Huang
Zixu Hao
Kun Li
Ting Cao
Youyou Lu
Yaoxue Zhang
Ju Ren
55
2
0
25 Oct 2024
LoRA-C: Parameter-Efficient Fine-Tuning of Robust CNN for IoT Devices
Chuntao Ding
Xu Cao
Jianhang Xie
Linlin Fan
Shangguang Wang
Zhichao Lu
39
1
0
22 Oct 2024
Mitigating Vanishing Activations in Deep CapsNets Using Channel Pruning
Siddharth Sahu
Abdulrahman Altahhan
3DPC
MedIm
50
0
0
22 Oct 2024
How Numerical Precision Affects Mathematical Reasoning Capabilities of LLMs
Guhao Feng
Kai-Bo Yang
Yuntian Gu
Xinyue Ai
Shengjie Luo
Jiacheng Sun
Di He
Zechao Li
Liwei Wang
LRM
42
6
0
17 Oct 2024