Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
Song Han, Huizi Mao, William J. Dally
1 October 2015 · arXiv:1510.00149

Papers citing "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"

Showing 50 of 3,448 citing papers, most recent first.
Attention Pruning: Automated Fairness Repair of Language Models via Surrogate Simulated Annealing
  Vishnu Asutosh Dasu, Md. Rafi Ur Rashid, Vipul Gupta, Saeid Tizpaz-Niari, Gang Tan · AAML · 20 Mar 2025

PARQ: Piecewise-Affine Regularized Quantization
  Lisa Jin, Jianhao Ma, Zechun Liu, Andrey Gromov, Aaron Defazio, Lin Xiao · MQ · 19 Mar 2025

Decision Tree Induction Through LLMs via Semantically-Aware Evolution
  Tennison Liu, Nicolas Huynh, M. Schaar · 18 Mar 2025

Knowledge Distillation: Enhancing Neural Network Compression with Integrated Gradients
  David E. Hernandez, J. Chang, Torbjörn E. M. Nordling · 17 Mar 2025

Changing Base Without Losing Pace: A GPU-Efficient Alternative to MatMul in DNNs
  Nir Ailon, Akhiad Bercovich, Omri Weinstein · 15 Mar 2025
Safe Vision-Language Models via Unsafe Weights Manipulation
  Moreno D'Incà, E. Peruzzo, Xingqian Xu, Humphrey Shi, N. Sebe, Massimiliano Mancini · MU · 14 Mar 2025
Stabilizing Quantization-Aware Training by Implicit-Regularization on Hessian Matrix
  Junbiao Pang, Tianyang Cai · 14 Mar 2025

Towards Extreme Pruning of LLMs with Plug-and-Play Mixed Sparsity
  Chi Xu, Gefei Zhang, Yantong Zhu, Luca Benini, Guosheng Hu, Yawei Li, Zhihong Zhang · 14 Mar 2025

ViM-VQ: Efficient Post-Training Vector Quantization for Visual Mamba
  Juncan Deng, Shuaiting Li, Zeyu Wang, Kedong Xu, Hong Gu, Kejie Huang · MQ · 12 Mar 2025

Residual Learning and Filtering Networks for End-to-End Lossless Video Compression
  Md Baharul Islam, Afsana Ahsan Jeny · 11 Mar 2025

SSVQ: Unleashing the Potential of Vector Quantization with Sign-Splitting
  Shuaiting Li, Juncan Deng, Chenxuan Wang, Kedong Xu, Rongtao Deng, Hong Gu, Haibin Shen, Kejie Huang · MQ · 11 Mar 2025

Empowering Edge Intelligence: A Comprehensive Survey on On-Device AI Models
  Xubin Wang, Zhiqing Tang, Jianxiong Guo, Tianhui Meng, Chenhao Wang, Tian-sheng Wang, Weijia Jia · 08 Mar 2025

Sample-aware Adaptive Structured Pruning for Large Language Models
  Jun Kong, Xinge Ma, Jin Wang, Xuejie Zhang · 08 Mar 2025

Personalized Federated Fine-tuning for Heterogeneous Data: An Automatic Rank Learning Approach via Two-Level LoRA
  Jie Hao, Yuman Wu, Ali Payani, Myungjin Lee, Mingrui Liu · 05 Mar 2025

FairSense-AI: Responsible AI Meets Sustainability
  Shaina Raza, Mukund Sayeeganesh Chettiar, Matin Yousefabadi, Tahniat Khan, Marcelo Lotif · 04 Mar 2025

Privacy-preserving Machine Learning in Internet of Vehicle Applications: Fundamentals, Recent Advances, and Future Direction
  Nazmul Islam, Mohammad Zulkernine · 03 Mar 2025
Eau De Q-Network: Adaptive Distillation of Neural Networks in Deep Reinforcement Learning
  Théo Vincent, Tim Lukas Faust, Yogesh Tripathi, Jan Peters, Carlo D'Eramo · 03 Mar 2025
RSQ: Learning from Important Tokens Leads to Better Quantized LLMs
  Yi-Lin Sung, Prateek Yadav, Jialu Li, Jaehong Yoon, Joey Tianyi Zhou · MQ · 03 Mar 2025

Mamba base PKD for efficient knowledge compression
  José Medina, Amnir Hadachi, Paul Honeine, Abdelaziz Bensrhair · Mamba · 03 Mar 2025

Split Adaptation for Pre-trained Vision Transformers
  Lixu Wang, Bingqi Shang, Yuchen Li, Payal Mohapatra, Wei Dong, Xiao-Xu Wang, Qi Zhu · ViT · 01 Mar 2025

AgroLLM: Connecting Farmers and Agricultural Practices through Large Language Models for Enhanced Knowledge Transfer and Practical Application
  Dinesh Jackson Samuel, Inna Skarga-Bandurova, David Sikolia, Muhammad Awais · 28 Feb 2025

MobiLLM: Enabling LLM Fine-Tuning on the Mobile Device via Server Assisted Side Tuning
  Liang Li, Xingke Yang, Wen Wu, Hao Wang, Tomoaki Ohtsuki, Xin Fu, Miao Pan, Xuemin Shen · 27 Feb 2025

Binary Neural Networks for Large Language Model: A Survey
  Liangdong Liu, Zhitong Zheng, Cong Wang, TianHuang Su, ZhenYu Yang · MQ · 26 Feb 2025

Mixtraining: A Better Trade-Off Between Compute and Performance
  Zexin Li, Jiancheng Zhang, Yufei Li, Yinglun Zhu, Cong Liu · 26 Feb 2025

When Compression Meets Model Compression: Memory-Efficient Double Compression for Large Language Models
  Weilan Wang, Yu Mao, Dongdong Tang, Hongchao Du, Nan Guan, Chun Jason Xue · MQ · 24 Feb 2025

Optimizing Singular Spectrum for Large Language Model Compression
  Dengjie Li, Tiancheng Shen, Yao Zhou, Baisong Yang, Zhongying Liu, Masheng Yang, Guohao Li, Yibo Yang, Yujie Zhong, Ming-Hsuan Yang · 24 Feb 2025

Pleno-Generation: A Scalable Generative Face Video Compression Framework with Bandwidth Intelligence
  Bolin Chen, Hanwei Zhu, Shanzhi Yin, Lingyu Zhu, Jie Chen, Ru-Ling Liao, Shiqi Wang, Yan Ye · 24 Feb 2025

More for Keys, Less for Values: Adaptive KV Cache Quantization
  Mohsen Hariri, Lam Nguyen, Sixu Chen, Shaochen Zhong, Qifan Wang, Xia Hu, Xiaotian Han, V. Chaudhary · MQ · 24 Feb 2025

"Actionable Help" in Crises: A Novel Dataset and Resource-Efficient Models for Identifying Request and Offer Social Media Posts
  Rabindra Lamsal, M. Read, S. Karunasekera, Muhammad Imran · 24 Feb 2025

Machine learning and high dimensional vector search
  Matthijs Douze · 24 Feb 2025
Automatic Joint Structured Pruning and Quantization for Efficient Neural Network Training and Compression
  Xiaoyi Qu, David Aponte, Colby R. Banbury, Daniel P. Robinson, Tianyu Ding, K. Koishida, Ilya Zharkov, Tianyi Chen · MQ · 23 Feb 2025

Verification of Bit-Flip Attacks against Quantized Neural Networks
  Yedi Zhang, Lei Huang, Pengfei Gao, Fu Song, Jun Sun, Jin Song Dong · AAML · 22 Feb 2025

FedSpaLLM: Federated Pruning of Large Language Models
  Guangji Bai, Yijiang Li, Zilinghan Li, Liang Zhao, Kibaek Kim · FedML · 20 Feb 2025

A General Error-Theoretical Analysis Framework for Constructing Compression Strategies
  Boyang Zhang, Daning Cheng, Yunquan Zhang, Meiqi Tu, Fangmin Liu, Jiake Tian · 19 Feb 2025

DSMoE: Matrix-Partitioned Experts with Dynamic Routing for Computation-Efficient Dense LLMs
  Minxuan Lv, Zhenpeng Su, Leiyu Pan, Yizhe Xiong, Zijia Lin, ..., Guiguang Ding, Cheng Luo, Di Zhang, Kun Gai, Songlin Hu · MoE · 18 Feb 2025

GPU Memory Usage Optimization for Backward Propagation in Deep Network Training
  Ding-Yong Hong, Tzu-Hsien Tsai, Ning Wang, Pangfeng Liu, Jan-Jan Wu · 18 Feb 2025

Vision-Language Models for Edge Networks: A Comprehensive Survey
  Ahmed Sharshar, Latif U. Khan, Waseem Ullah, Mohsen Guizani · VLM · 11 Feb 2025

EfficientLLM: Scalable Pruning-Aware Pretraining for Architecture-Agnostic Edge Language Models
  Xingrun Xing, Zheng Liu, Shitao Xiao, Boyan Gao, Yiming Liang, Wanpeng Zhang, Haokun Lin, Guoqi Li, Jiajun Zhang · LRM · 10 Feb 2025
Kolmogorov-Arnold Fourier Networks
  Jusheng Zhang, Yijia Fan, Kaitong Cai, Keze Wang · 09 Feb 2025

BCQ: Block Clustered Quantization for 4-bit (W4A4) LLM Inference
  Reena Elangovan, Charbel Sakr, A. Raghunathan, Brucek Khailany · MQ · 07 Feb 2025

Advancing Weight and Channel Sparsification with Enhanced Saliency
  Xinglong Sun, Maying Shen, Hongxu Yin, Lei Mao, Pavlo Molchanov, Jose M. Alvarez · 05 Feb 2025

Progressive Binarization with Semi-Structured Pruning for LLMs
  Xinyu Yan, Tianao Zhang, Zhiteng Li, Yulun Zhang · MQ · 03 Feb 2025

Position: AI Scaling: From Up to Down and Out
  Yunke Wang, Yanxi Li, Chang Xu · HAI · 02 Feb 2025

Brain-inspired sparse training enables Transformers and LLMs to perform as fully connected
  Yingtao Zhang, Jialin Zhao, Wenjing Wu, Ziheng Liao, Umberto Michieli, C. Cannistraci · 31 Jan 2025

DCentNet: Decentralized Multistage Biomedical Signal Classification using Early Exits
  Xiaolin Li, Binhua Huang, B. Cardiff, Deepu John · 31 Jan 2025

SLoPe: Double-Pruned Sparse Plus Lazy Low-Rank Adapter Pretraining of LLMs
  Mohammad Mozaffari, Amir Yazdanbakhsh, Zhao Zhang, M. Dehnavi · 28 Jan 2025

Ditto: Accelerating Diffusion Model via Temporal Value Similarity
  Sungbin Kim, Hyunwuk Lee, Wonho Cho, Mincheol Park, Won Woo Ro · 20 Jan 2025

MOGNET: A Mux-residual quantized Network leveraging Online-Generated weights
  Van Thien Nguyen, William Guicquero, Gilles Sicard · MQ · 17 Jan 2025

Lossless Compression of Vector IDs for Approximate Nearest Neighbor Search
  Daniel de Souza Severo, Giuseppe Ottaviano, Matthew Muckley, Karen Ullrich, Matthijs Douze · MQ · 16 Jan 2025

Histogram-Equalized Quantization for logic-gated Residual Neural Networks
  Van Thien Nguyen, William Guicquero, Gilles Sicard · MQ · 10 Jan 2025