arXiv:1510.00149
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
1 October 2015
Song Han
Huizi Mao
W. Dally
3DGS
Papers citing
"Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"
50 / 3,448 papers shown
Title
Localize-and-Stitch: Efficient Model Merging via Sparse Task Arithmetic
Yifei He
Yuzheng Hu
Yong Lin
Tong Zhang
Han Zhao
FedML
MoMe
70
19
0
08 Jan 2025
EEG Emotion Copilot: Optimizing Lightweight LLMs for Emotional EEG Interpretation with Assisted Medical Record Generation
Hongyu Chen
Weiming Zeng
Cen Chen
Luhui Cai
Fei-Yue Wang
...
Wei Zhang
Yuchen Li
Hongjie Yan
W. Siok
Nizhuan Wang
49
1
0
08 Jan 2025
Optimizing Edge AI: A Comprehensive Survey on Data, Model, and System Strategies
Xubin Wang
Weijia Jia
41
0
0
08 Jan 2025
PTEENet: Post-Trained Early-Exit Neural Networks Augmentation for Inference Cost Optimization
Assaf Lahiany
Yehudit Aperstein
38
4
0
07 Jan 2025
A Novel Structure-Agnostic Multi-Objective Approach for Weight-Sharing Compression in Deep Neural Networks
Rasa Khosrowshahli
Shahryar Rahnamayan
Beatrice Ombuki-Berman
MQ
28
0
0
06 Jan 2025
Quantization Meets Reasoning: Exploring LLM Low-Bit Quantization Degradation for Mathematical Reasoning
Zhen Li
Yupeng Su
Runming Yang
C. Xie
Zehua Wang
Zhongwei Xie
Ngai Wong
Hongxia Yang
MQ
LRM
61
3
0
06 Jan 2025
Pruning-based Data Selection and Network Fusion for Efficient Deep Learning
Humaira Kousar
Hasnain Irshad Bhatti
Jaekyun Moon
44
0
0
03 Jan 2025
SlimGPT: Layer-wise Structured Pruning for Large Language Models
Gui Ling
Ziyang Wang
Yuliang Yan
Qingwen Liu
38
2
0
24 Dec 2024
AutoSculpt: A Pattern-based Model Auto-pruning Framework Using Reinforcement Learning and Graph Learning
Lixian Jing
Jianpeng Qi
Junyu Dong
Yanwei Yu
3DPC
AI4CE
54
0
0
24 Dec 2024
GQSA: Group Quantization and Sparsity for Accelerating Large Language Model Inference
Chao Zeng
Songwei Liu
Shu Yang
Fangmin Chen
Xing Mei
Lean Fu
MQ
54
0
0
23 Dec 2024
Lightweight Design and Optimization methods for DCNNs: Progress and Futures
Hanhua Long
Wenbin Bi
Jian Sun
92
0
0
22 Dec 2024
Rethinking Model Redundancy for Low-light Image Enhancement
Tong Li
Lizhi Wang
Hansen Feng
Lin Zhu
Wanxuan Lu
Hua Huang
89
0
0
21 Dec 2024
Holistic Adversarially Robust Pruning
Qi Zhao
Christian Wressnegger
95
9
0
19 Dec 2024
RemoteTrimmer: Adaptive Structural Pruning for Remote Sensing Image Classification
Guanwenjie Zou
Liang Yao
F. Liu
Chuanyi Zhang
Xin Li
Ning Chen
Shengxiang Xu
Jun Zhou
74
1
0
17 Dec 2024
Priority-Aware Model-Distributed Inference at Edge Networks
Teng Li
Hulya Seferoglu
81
0
0
16 Dec 2024
Designing Semi-Structured Pruning of Graph Convolutional Networks for Skeleton-based Recognition
Hichem Sahbi
CVBM
79
0
0
16 Dec 2024
MOFHEI: Model Optimizing Framework for Fast and Efficient Homomorphically Encrypted Neural Network Inference
Parsa Ghazvinian
Robert Podschwadt
Prajwal Panzade
Mohammad H. Rafiei
Daniel Takabi
77
0
0
10 Dec 2024
TT-MPD: Test Time Model Pruning and Distillation
Haihang Wu
Wei Wang
T. Malepathirana
Sachith Seneviratne
D. Oetomo
Saman K. Halgamuge
76
0
0
10 Dec 2024
DEX: Data Channel Extension for Efficient CNN Inference on Tiny AI Accelerators
Taesik Gong
F. Kawsar
Chulhong Min
80
3
0
09 Dec 2024
MultiTASC++: A Continuously Adaptive Scheduler for Edge-Based Multi-Device Cascade Inference
Sokratis Nikolaidis
Stylianos I. Venieris
I. Venieris
91
0
0
05 Dec 2024
Quantized and Interpretable Learning Scheme for Deep Neural Networks in Classification Task
Alireza Maleki
Mahsa Lavaei
Mohsen Bagheritabar
Salar Beigzad
Zahra Abadi
MQ
77
0
0
05 Dec 2024
CPTQuant -- A Novel Mixed Precision Post-Training Quantization Techniques for Large Language Models
Amitash Nanda
Sree Bhargavi Balija
D. Sahoo
MQ
74
0
0
03 Dec 2024
AdaScale: Dynamic Context-aware DNN Scaling via Automated Adaptation Loop on Mobile Devices
Yuzhan Wang
Sicong Liu
Bin Guo
Boqi Zhang
Ke Ma
Yasan Ding
Hao Luo
Yao Li
Zhiwen Yu
87
1
0
01 Dec 2024
Is Oracle Pruning the True Oracle?
Sicheng Feng
Keda Tao
Haoyu Wang
VLM
75
0
0
28 Nov 2024
TinyViM: Frequency Decoupling for Tiny Hybrid Vision Mamba
Xiaowen Ma
Zhenliang Ni
Xinghao Chen
Mamba
93
2
0
26 Nov 2024
DRPruning: Efficient Large Language Model Pruning through Distributionally Robust Optimization
Hexuan Deng
Wenxiang Jiao
Xuebo Liu
Min Zhang
Zhaopeng Tu
VLM
85
0
0
21 Nov 2024
Pushing the Limits of Sparsity: A Bag of Tricks for Extreme Pruning
Andy Li
A. Durrant
Milan Markovic
Lu Yin
Georgios Leontidis
Tianlong Chen
82
0
0
20 Nov 2024
SoftLMs: Efficient Adaptive Low-Rank Approximation of Language Models using Soft-Thresholding Mechanism
Priyansh Bhatnagar
Linfeng Wen
Mingu Kang
39
0
0
15 Nov 2024
P$^2$ Law: Scaling Law for Post-Training After Model Pruning
Xiaodong Chen
Yuxuan Hu
Jing Zhang
Xiaokang Zhang
C. Li
Hongyu Chen
34
0
0
15 Nov 2024
NVCiM-PT: An NVCiM-assisted Prompt Tuning Framework for Edge LLMs
Ruiyang Qin
Pengyu Ren
Zheyu Yan
Liu Liu
Dancheng Liu
Amir Nassereldine
Jinjun Xiong
Kai Ni
Sharon Hu
Yiyu Shi
VLM
75
1
0
12 Nov 2024
Optimizing Traffic Signal Control using High-Dimensional State Representation and Efficient Deep Reinforcement Learning
Lawrence Francis
Blessed Guda
Ahmed Biyabani
27
0
0
12 Nov 2024
CULL-MT: Compression Using Language and Layer pruning for Machine Translation
Pedram Rostami
M. Dousti
39
0
0
10 Nov 2024
Client Contribution Normalization for Enhanced Federated Learning
Mayank Kumar Kundalwal
Anurag Saraswat
Ishan Mishra
Deepak Mishra
FedML
41
0
0
10 Nov 2024
Learning Morphisms with Gauss-Newton Approximation for Growing Networks
Neal Lawton
Aram Galstyan
Greg Ver Steeg
26
0
0
07 Nov 2024
Flashy Backdoor: Real-world Environment Backdoor Attack on SNNs with DVS Cameras
Roberto Riaño
Gorka Abad
S. Picek
A. Urbieta
AAML
41
0
0
05 Nov 2024
Magnitude Pruning of Large Pretrained Transformer Models with a Mixture Gaussian Prior
Mingxuan Zhang
Y. Sun
F. Liang
41
0
0
01 Nov 2024
On the Impact of White-box Deployment Strategies for Edge AI on Latency and Model Performance
Jaskirat Singh
Bram Adams
Ahmed E. Hassan
VLM
47
0
0
01 Nov 2024
Mutual Information Preserving Neural Network Pruning
Charles Westphal
Stephen Hailes
Mirco Musolesi
59
1
0
31 Oct 2024
Offline Behavior Distillation
Shiye Lei
Sen Zhang
Dacheng Tao
OffRL
46
0
0
30 Oct 2024
Efficient Reprogramming of Memristive Crossbars for DNNs: Weight Sorting and Bit Stucking
Matheus Farias
H. T. Kung
MQ
30
0
0
29 Oct 2024
Data Generation for Hardware-Friendly Post-Training Quantization
Lior Dikstein
Ariel Lapid
Arnon Netzer
H. Habi
MQ
262
0
0
29 Oct 2024
MultiTok: Variable-Length Tokenization for Efficient LLMs Adapted from LZW Compression
Noel Elias
H. Esfahanizadeh
Kaan Kale
S. Vishwanath
Muriel Médard
38
0
0
28 Oct 2024
BLAST: Block-Level Adaptive Structured Matrices for Efficient Deep Neural Network Inference
Changwoo Lee
Soo Min Kwon
Qing Qu
Hun-Seok Kim
36
0
0
28 Oct 2024
Deep Insights into Automated Optimization with Large Language Models and Evolutionary Algorithms
He Yu
Jiaheng Liu
59
2
0
28 Oct 2024
Meta-Learning for Speeding Up Large Model Inference in Decentralized Environments
Yuzhe Yang
Yipeng Du
Ahmad Farhan
Claudio Angione
Yue Zhao
Harry Yang
Fielding Johnston
James Buban
Patrick Colangelo
39
0
0
28 Oct 2024
NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks
Yongchang Hao
Yanshuai Cao
Lili Mou
MQ
36
2
0
28 Oct 2024
Ripple: Accelerating LLM Inference on Smartphones with Correlation-Aware Neuron Management
Tuowei Wang
Ruwen Fan
Minxing Huang
Zixu Hao
Kun Li
Ting Cao
Youyou Lu
Yaoxue Zhang
Ju Ren
55
2
0
25 Oct 2024
LoRA-C: Parameter-Efficient Fine-Tuning of Robust CNN for IoT Devices
Chuntao Ding
Xu Cao
Jianhang Xie
Linlin Fan
Shangguang Wang
Zhichao Lu
39
1
0
22 Oct 2024
Mitigating Vanishing Activations in Deep CapsNets Using Channel Pruning
Siddharth Sahu
Abdulrahman Altahhan
3DPC
MedIm
50
0
0
22 Oct 2024
How Numerical Precision Affects Mathematical Reasoning Capabilities of LLMs
Guhao Feng
Kai-Bo Yang
Yuntian Gu
Xinyue Ai
Shengjie Luo
Jiacheng Sun
Di He
Zechao Li
Liwei Wang
LRM
42
6
0
17 Oct 2024