ResearchTrend.AI
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding

1 October 2015
Song Han
Huizi Mao
W. Dally
    3DGS

Papers citing "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"

50 / 3,446 papers shown
DeFTX: Denoised Sparse Fine-Tuning for Zero-Shot Cross-Lingual Transfer
Sona Elza Simon
Preethi Jyothi
VLM
14
0
0
21 May 2025
Refining Neural Activation Patterns for Layer-Level Concept Discovery in Neural Network-Based Receivers
Marko Tuononen
Duy Vu
Dani Korpi
Vesa Starck
Ville Hautamäki
12
0
0
21 May 2025
Optimal Client Sampling in Federated Learning with Client-Level Heterogeneous Differential Privacy
Jiahao Xu
Rui Hu
Olivera Kotevska
FedML
26
0
0
19 May 2025
An Overview of Arithmetic Adaptations for Inference of Convolutional Neural Networks on Re-configurable Hardware
Ilkay Wunderlich
Benjamin Koch
Sven Schönfeld
19
2
0
19 May 2025
HarmonE: A Self-Adaptive Approach to Architecting Sustainable MLOps
Hiya Bhatt
Shaunak Biswas
Srinivasan Rakhunathan
Karthik Vaidhyanathan
AI4CE
17
0
0
19 May 2025
Automatic Complementary Separation Pruning Toward Lightweight CNNs
David Levin
Gonen Singer
12
0
0
19 May 2025
QUADS: QUAntized Distillation Framework for Efficient Speech Language Understanding
Subrata Biswas
Mohammad Nur Hossain Khan
Bashima Islam
7
0
0
19 May 2025
InfiJanice: Joint Analysis and In-situ Correction Engine for Quantization-Induced Math Degradation in Large Language Models
Zhen Li
Yupeng Su
Songmiao Wang
Runming Yang
C. Xie
...
Ming Li
Jiannong Cao
Yuan Xie
Ngai Wong
Hongxia Yang
MQ
12
0
0
16 May 2025
Efficient Unstructured Pruning of Mamba State-Space Models for Resource-Constrained Environments
Ibne Farabi Shihab
Sanjeda Akter
Anuj Sharma
Mamba
54
0
0
13 May 2025
Resource-Efficient Language Models: Quantization for Fast and Accessible Inference
Tollef Emil Jørgensen
MQ
56
0
0
13 May 2025
Sparse Training from Random Initialization: Aligning Lottery Ticket Masks using Weight Symmetry
Mohammed Adnan
Rohan Jain
Ekansh Sharma
Rahul Krishnan
Yani Andrew Ioannou
61
0
0
08 May 2025
PROM: Prioritize Reduction of Multiplications Over Lower Bit-Widths for Efficient CNNs
Lukas Meiner
Jens Mehnert
Alexandru Paul Condurache
MQ
42
0
0
06 May 2025
Efficient Continual Learning in Keyword Spotting using Binary Neural Networks
Quynh Nguyen Phuong Vu
Luciano S. Martinez-Rau
Yuxuan Zhang
Nho-Duc Tran
Bengt Oelmann
Michele Magno
Sebastian Bader
CLL
45
0
0
05 May 2025
FPGA-based Acceleration for Convolutional Neural Networks: A Comprehensive Review
Junye Jiang
Yaan Zhou
Yuanhao Gong
Haoxuan Yuan
Shuanglong Liu
2
0
0
04 May 2025
Efficient Shapley Value-based Non-Uniform Pruning of Large Language Models
Chuan Sun
Han Yu
Lizhen Cui
Xiaoxiao Li
181
0
0
03 May 2025
HMI: Hierarchical Knowledge Management for Efficient Multi-Tenant Inference in Pretrained Language Models
J. Zhang
Rongxiang Weng
Yiming Li
Lidan Shou
Ke Chen
Gang Chen
Qin Xie
Guiming Xie
Xuejian Gong
33
0
0
24 Apr 2025
BackSlash: Rate Constrained Optimized Training of Large Language Models
Jun Wu
Jiangtao Wen
Yuxing Han
39
0
0
23 Apr 2025
Efficient Adaptation of Deep Neural Networks for Semantic Segmentation in Space Applications
Leonardo Olivi
Edoardo Santero Mormile
Enzo Tartaglione
SSeg
35
0
0
22 Apr 2025
Mathematical Programming Models for Exact and Interpretable Formulation of Neural Networks
Masoud Ataei
Edrin Hasaj
Jacob Gipp
Sepideh Forouzi
29
0
0
19 Apr 2025
Set You Straight: Auto-Steering Denoising Trajectories to Sidestep Unwanted Concepts
Leyang Li
Shilin Lu
Yan Ren
A. Kong
DiffM
56
1
0
17 Apr 2025
Collaborative Learning of On-Device Small Model and Cloud-Based Large Model: Advances and Future Directions
Chaoyue Niu
Yucheng Ding
Junhui Lu
Zhengxiang Huang
Hang Zeng
Yutong Dai
Xuezhen Tu
Chengfei Lv
Fan Wu
Guihai Chen
35
1
0
17 Apr 2025
ConvShareViT: Enhancing Vision Transformers with Convolutional Attention Mechanisms for Free-Space Optical Accelerators
Riad Ibadulla
Thomas M. Chen
C. Reyes-Aldasoro
ViT
34
0
0
15 Apr 2025
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float
Tianyi Zhang
Yang Sui
Shaochen Zhong
V. Chaudhary
Xia Hu
Anshumali Shrivastava
MQ
32
1
0
15 Apr 2025
Mamba-Based Ensemble learning for White Blood Cell Classification
Lewis Clifton
X. Tian
D. Palasuwan
Phandee Watanaboonyongcharoen
Ponlapat Rojnuckarin
Nantheera Anantrasirichai
Mamba
51
0
0
15 Apr 2025
Efficient Reasoning Models: A Survey
Sicheng Feng
Gongfan Fang
Xinyin Ma
Xinchao Wang
ReLM
LRM
199
5
0
15 Apr 2025
CUT: Pruning Pre-Trained Multi-Task Models into Compact Models for Edge Devices
Jingxuan Zhou
Weidong Bao
Ji Wang
Zhengyi Zhong
32
0
0
14 Apr 2025
Tin-Tin: Towards Tiny Learning on Tiny Devices with Integer-based Neural Network Training
Yi Hu
Jinhang Zuo
Eddie Zhang
Bob Iannucci
Carlee Joe-Wong
37
0
0
13 Apr 2025
Can LLMs Revolutionize the Design of Explainable and Efficient TinyML Models?
Christophe El Zeinaty
W. Hamidouche
Glenn Herrou
D. Ménard
Merouane Debbah
43
0
0
13 Apr 2025
Cycle Training with Semi-Supervised Domain Adaptation: Bridging Accuracy and Efficiency for Real-Time Mobile Scene Detection
Huu-Phong Phan-Nguyen
Anh Dao
T. Nguyen
Tuan Quang
H. Tran
Tinh-Anh Nguyen-Nhu
Huy-Thach Pham
Quan Nguyen
Hoang M. Le
Quang-Vinh Dinh
41
0
0
12 Apr 2025
Two is Better than One: Efficient Ensemble Defense for Robust and Compact Models
Yoojin Jung
Byung Cheol Song
AAML
VLM
MQ
41
0
0
07 Apr 2025
Optimizing Large Language Models: Metrics, Energy Efficiency, and Case Study Insights
Tahniat Khan
Soroor Motie
Sedef Akinli Kocak
Shaina Raza
MQ
44
0
0
07 Apr 2025
Hyperflows: Pruning Reveals the Importance of Weights
Eugen Barbulescu
Antonio Alexoaie
36
0
0
06 Apr 2025
Towards Understanding and Improving Refusal in Compressed Models via Mechanistic Interpretability
Vishnu Kabir Chhabra
Mohammad Mahdi Khalili
AI4CE
33
0
0
05 Apr 2025
Online Difficulty Filtering for Reasoning Oriented Reinforcement Learning
Sanghwan Bae
Jiwoo Hong
Min Young Lee
Hanbyul Kim
Jeongyeon Nam
Donghyun Kwak
OffRL
LRM
58
4
0
04 Apr 2025
HyperRAG: Enhancing Quality-Efficiency Tradeoffs in Retrieval-Augmented Generation with Reranker KV-Cache Reuse
Yuwei An
Yihua Cheng
Seo Jin Park
Junchen Jiang
50
1
0
03 Apr 2025
MDP: Multidimensional Vision Model Pruning with Latency Constraint
Xinglong Sun
Barath Lakshmanan
Maying Shen
Shiyi Lan
Jingde Chen
Jose M. Alvarez
VLM
62
0
0
02 Apr 2025
FedPaI: Achieving Extreme Sparsity in Federated Learning via Pruning at Initialization
Haonan Wang
Zichen Liu
Kajimusugura Hoshino
Tuo Zhang
J. Walters
S. Crago
49
0
0
01 Apr 2025
Machine Learning-assisted High-speed Combinatorial Optimization with Ising Machines for Dynamically Changing Problems
Yohei Hamakawa
Tomoya Kashimata
Masaya Yamasaki
Kosuke Tatsumura
AI4CE
42
0
0
31 Mar 2025
Optimization of Layer Skipping and Frequency Scaling for Convolutional Neural Networks under Latency Constraint
Minh David Thao Chan
Ruoyu Zhao
Yukuan Jia
Ruiqing Mao
Sheng Zhou
51
0
0
31 Mar 2025
Mobile-VideoGPT: Fast and Accurate Video Understanding Language Model
Abdelrahman M. Shaker
Muhammad Maaz
Chenhui Gou
Hamid Rezatofighi
Salman Khan
Fahad Shahbaz Khan
225
0
0
27 Mar 2025
Optimizing Multi-DNN Inference on Mobile Devices through Heterogeneous Processor Co-Execution
Yunquan Gao
Zhiguo Zhang
Praveen Kumar Donta
C. Dehury
Xinbing Wang
Dusit Niyato
Qiyang Zhang
46
0
0
27 Mar 2025
An Efficient Training Algorithm for Models with Block-wise Sparsity
Ding Zhu
Zhiqun Zuo
Mohammad Mahdi Khalili
42
0
0
27 Mar 2025
Boosting Large Language Models with Mask Fine-Tuning
M. Zhang
Yue Bai
Huan Wang
Yizhou Wang
Qihua Dong
Y. Fu
CLL
58
0
0
27 Mar 2025
Lipschitz Constant Meets Condition Number: Learning Robust and Compact Deep Neural Networks
Yangqi Feng
S. J. Lin
Baoyuan Gao
Xian Wei
AAML
78
0
0
26 Mar 2025
A Low-complexity Structured Neural Network Approach to Intelligently Realize Wideband Multi-beam Beamformers
Hansaka Aluvihare
Sivakumar Sivasankar
Xianqi Li
Arjuna Madanayake
Sirani M. Perera
83
0
0
26 Mar 2025
GIViC: Generative Implicit Video Compression
Ge Gao
Siyue Teng
Tianhao Peng
Fan Zhang
David Bull
DiffM
VGen
50
0
0
25 Mar 2025
MoST: Efficient Monarch Sparse Tuning for 3D Representation Learning
Xu Han
Yuan Tang
Jinfeng Xu
Xianzhi Li
53
0
0
24 Mar 2025
Temporal Action Detection Model Compression by Progressive Block Drop
Xiaoyong Chen
Yong Guo
Jiaming Liang
Sitong Zhuang
Runhao Zeng
Xiping Hu
55
0
0
21 Mar 2025
Attention Pruning: Automated Fairness Repair of Language Models via Surrogate Simulated Annealing
Vishnu Asutosh Dasu
Md. Rafi Ur Rashid
Vipul Gupta
Saeid Tizpaz-Niari
Gang Tan
AAML
54
0
0
20 Mar 2025
PARQ: Piecewise-Affine Regularized Quantization
Lisa Jin
Jianhao Ma
Zechun Liu
Andrey Gromov
Aaron Defazio
Lin Xiao
MQ
43
0
0
19 Mar 2025