Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
1510.00149
Cited By
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
1 October 2015
Song Han
Huizi Mao
W. Dally
3DGS
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding"
50 / 3,448 papers shown
Title
Is Quantum Optimization Ready? An Effort Towards Neural Network Compression using Adiabatic Quantum Computing
Zhehui Wanga
Benjamin Chen Ming Choonga
Tian Huang
Daniel Gerlinghoffa
Rick Siow Mong Goh
Cheng Liu
Tao Luo
5
0
0
22 May 2025
NQKV: A KV Cache Quantization Scheme Based on Normal Distribution Characteristics
Zhihang Cai
Xingjun Zhang
Zhendong Tan
Zheng Wei
MQ
0
0
0
22 May 2025
Refining Neural Activation Patterns for Layer-Level Concept Discovery in Neural Network-Based Receivers
Marko Tuononen
Duy Vu
Dani Korpi
Vesa Starck
Ville Hautamäki
17
0
0
21 May 2025
DeFTX: Denoised Sparse Fine-Tuning for Zero-Shot Cross-Lingual Transfer
Sona Elza Simon
Preethi Jyothi
VLM
14
0
0
21 May 2025
HarmonE: A Self-Adaptive Approach to Architecting Sustainable MLOps
Hiya Bhatt
Shaunak Biswas
Srinivasan Rakhunathan
Karthik Vaidhyanathan
AI4CE
17
0
0
19 May 2025
An Overview of Arithmetic Adaptations for Inference of Convolutional Neural Networks on Re-configurable Hardware
Ilkay Wunderlich
Benjamin Koch
Sven Schönfeld
33
2
0
19 May 2025
QUADS: QUAntized Distillation Framework for Efficient Speech Language Understanding
Subrata Biswas
Mohammad Nur Hossain Khan
Bashima Islam
7
0
0
19 May 2025
Optimal Client Sampling in Federated Learning with Client-Level Heterogeneous Differential Privacy
Jiahao Xu
Rui Hu
Olivera Kotevska
FedML
27
0
0
19 May 2025
Automatic Complementary Separation Pruning Toward Lightweight CNNs
David Levin
Gonen Singer
14
0
0
19 May 2025
InfiJanice: Joint Analysis and In-situ Correction Engine for Quantization-Induced Math Degradation in Large Language Models
Zhen Li
Yupeng Su
Songmiao Wang
Runming Yang
C. Xie
...
Ming Li
Jiannong Cao
Yuan Xie
Ngai Wong
Hongxia Yang
MQ
17
0
0
16 May 2025
Resource-Efficient Language Models: Quantization for Fast and Accessible Inference
Tollef Emil Jørgensen
MQ
59
0
0
13 May 2025
Efficient Unstructured Pruning of Mamba State-Space Models for Resource-Constrained Environments
Ibne Farabi Shihab
Sanjeda Akter
Anuj Sharma
Mamba
54
0
0
13 May 2025
Sparse Training from Random Initialization: Aligning Lottery Ticket Masks using Weight Symmetry
Mohammed Adnan
Rohan Jain
Ekansh Sharma
Rahul Krishnan
Yani Andrew Ioannou
64
0
0
08 May 2025
PROM: Prioritize Reduction of Multiplications Over Lower Bit-Widths for Efficient CNNs
Lukas Meiner
Jens Mehnert
Alexandru Paul Condurache
MQ
47
0
0
06 May 2025
Efficient Continual Learning in Keyword Spotting using Binary Neural Networks
Quynh Nguyen Phuong Vu
Luciano S. Martinez-Rau
Yuxuan Zhang
Nho-Duc Tran
Bengt Oelmann
Michele Magno
Sebastian Bader
CLL
47
0
0
05 May 2025
FPGA-based Acceleration for Convolutional Neural Networks: A Comprehensive Review
Junye Jiang
Yaan Zhou
Yuanhao Gong
Haoxuan Yuan
Shuanglong Liu
12
0
0
04 May 2025
Efficient Shapley Value-based Non-Uniform Pruning of Large Language Models
Chuan Sun
Han Yu
Lizhen Cui
Xiaoxiao Li
205
0
0
03 May 2025
HMI: Hierarchical Knowledge Management for Efficient Multi-Tenant Inference in Pretrained Language Models
Junxuan Zhang
Jie Wang
Haoyang Li
Lidan Shou
Ke Chen
Gang Chen
Qin Xie
Guiming Xie
Xuejian Gong
33
0
0
24 Apr 2025
BackSlash: Rate Constrained Optimized Training of Large Language Models
Jun Wu
Jiangtao Wen
Yuxing Han
39
0
0
23 Apr 2025
Efficient Adaptation of Deep Neural Networks for Semantic Segmentation in Space Applications
Leonardo Olivi
Edoardo Santero Mormile
Enzo Tartaglione
SSeg
40
0
0
22 Apr 2025
Mathematical Programming Models for Exact and Interpretable Formulation of Neural Networks
Masoud Ataei
Edrin Hasaj
Jacob Gipp
Sepideh Forouzi
29
0
0
19 Apr 2025
Set You Straight: Auto-Steering Denoising Trajectories to Sidestep Unwanted Concepts
Leyang Li
Shilin Lu
Yan Ren
A. Kong
DiffM
59
1
0
17 Apr 2025
Collaborative Learning of On-Device Small Model and Cloud-Based Large Model: Advances and Future Directions
Chaoyue Niu
Yucheng Ding
Junhui Lu
Zhengxiang Huang
Hang Zeng
Yutong Dai
Xuezhen Tu
Chengfei Lv
Fan Wu
Guihai Chen
42
1
0
17 Apr 2025
Efficient Reasoning Models: A Survey
Sicheng Feng
Gongfan Fang
Xinyin Ma
Xinchao Wang
ReLM
LRM
202
5
0
15 Apr 2025
Mamba-Based Ensemble learning for White Blood Cell Classification
Lewis Clifton
X. Tian
D. Palasuwan
Phandee Watanaboonyongcharoen
Ponlapat Rojnuckarin
Nantheera Anantrasirichai
Mamba
59
0
0
15 Apr 2025
70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float
Tianyi Zhang
Yang Sui
Shaochen Zhong
V. Chaudhary
Xia Hu
Anshumali Shrivastava
MQ
37
1
0
15 Apr 2025
ConvShareViT: Enhancing Vision Transformers with Convolutional Attention Mechanisms for Free-Space Optical Accelerators
Riad Ibadulla
Thomas M. Chen
C. Reyes-Aldasoro
ViT
36
0
0
15 Apr 2025
CUT: Pruning Pre-Trained Multi-Task Models into Compact Models for Edge Devices
Jingxuan Zhou
Weidong Bao
Ji Wang
Zhengyi Zhong
32
0
0
14 Apr 2025
Tin-Tin: Towards Tiny Learning on Tiny Devices with Integer-based Neural Network Training
Yi Hu
Jinhang Zuo
Eddie Zhang
Bob Iannucci
Carlee Joe-Wong
42
0
0
13 Apr 2025
Can LLMs Revolutionize the Design of Explainable and Efficient TinyML Models?
Christophe El Zeinaty
W. Hamidouche
Glenn Herrou
D. Ménard
Merouane Debbah
48
0
0
13 Apr 2025
Cycle Training with Semi-Supervised Domain Adaptation: Bridging Accuracy and Efficiency for Real-Time Mobile Scene Detection
Huu-Phong Phan-Nguyen
Anh Dao
T. Nguyen
Tuan Quang
H. Tran
Tinh-Anh Nguyen-Nhu
Huy-Thach Pham
Quan Nguyen
Hoang M. Le
Quang-Vinh Dinh
44
0
0
12 Apr 2025
Two is Better than One: Efficient Ensemble Defense for Robust and Compact Models
Yoojin Jung
Byung Cheol Song
AAML
VLM
MQ
46
0
0
07 Apr 2025
Optimizing Large Language Models: Metrics, Energy Efficiency, and Case Study Insights
Tahniat Khan
Soroor Motie
Sedef Akinli Kocak
Shaina Raza
MQ
46
0
0
07 Apr 2025
Hyperflows: Pruning Reveals the Importance of Weights
Eugen Barbulescu
Antonio Alexoaie
41
0
0
06 Apr 2025
Towards Understanding and Improving Refusal in Compressed Models via Mechanistic Interpretability
Vishnu Kabir Chhabra
Mohammad Mahdi Khalili
AI4CE
33
0
0
05 Apr 2025
Online Difficulty Filtering for Reasoning Oriented Reinforcement Learning
Sanghwan Bae
Jiwoo Hong
Min Young Lee
Hanbyul Kim
Jeongyeon Nam
Donghyun Kwak
OffRL
LRM
58
4
0
04 Apr 2025
HyperRAG: Enhancing Quality-Efficiency Tradeoffs in Retrieval-Augmented Generation with Reranker KV-Cache Reuse
Yuwei An
Yihua Cheng
Seo Jin Park
Junchen Jiang
50
1
0
03 Apr 2025
MDP: Multidimensional Vision Model Pruning with Latency Constraint
Xinglong Sun
Barath Lakshmanan
Maying Shen
Shiyi Lan
Jingde Chen
Jose M. Alvarez
VLM
67
0
0
02 Apr 2025
FedPaI: Achieving Extreme Sparsity in Federated Learning via Pruning at Initialization
Haonan Wang
Ziqiang Liu
Kajimusugura Hoshino
Tuo Zhang
J. Walters
S. Crago
54
0
0
01 Apr 2025
Machine Learning-assisted High-speed Combinatorial Optimization with Ising Machines for Dynamically Changing Problems
Yohei Hamakawa
Tomoya Kashimata
Masaya Yamasaki
Kosuke Tatsumura
AI4CE
47
0
0
31 Mar 2025
Optimization of Layer Skipping and Frequency Scaling for Convolutional Neural Networks under Latency Constraint
Minh David Thao Chan
Ruoyu Zhao
Yukuan Jia
Ruiqing Mao
Sheng Zhou
51
0
0
31 Mar 2025
Boosting Large Language Models with Mask Fine-Tuning
M. Zhang
Yue Bai
Huan Wang
Yizhou Wang
Qihua Dong
Y. Fu
CLL
58
0
0
27 Mar 2025
Mobile-VideoGPT: Fast and Accurate Video Understanding Language Model
Abdelrahman M. Shaker
Muhammad Maaz
Chenhui Gou
Hamid Rezatofighi
Salman Khan
Fahad Shahbaz Khan
249
0
0
27 Mar 2025
Optimizing Multi-DNN Inference on Mobile Devices through Heterogeneous Processor Co-Execution
Yunquan Gao
Zhiguo Zhang
Praveen Kumar Donta
C. Dehury
Xiang Wang
Dusit Niyato
Qiyang Zhang
46
0
0
27 Mar 2025
An Efficient Training Algorithm for Models with Block-wise Sparsity
Ding Zhu
Zhiqun Zuo
Mohammad Mahdi Khalili
44
0
0
27 Mar 2025
Lipschitz Constant Meets Condition Number: Learning Robust and Compact Deep Neural Networks
Yangqi Feng
S. J. Lin
Baoyuan Gao
Xian Wei
AAML
78
0
0
26 Mar 2025
A Low-complexity Structured Neural Network Approach to Intelligently Realize Wideband Multi-beam Beamformers
Hansaka Aluvihare
Sivakumar Sivasankar
Xianqi Li
Arjuna Madanayake
Sirani M. Perera
83
0
0
26 Mar 2025
GIViC: Generative Implicit Video Compression
Ge Gao
Siyue Teng
Tianhao Peng
Fan Zhang
David Bull
DiffM
VGen
50
0
0
25 Mar 2025
MoST: Efficient Monarch Sparse Tuning for 3D Representation Learning
Xu Han
Yuan Tang
Jinfeng Xu
Xianzhi Li
59
0
0
24 Mar 2025
Temporal Action Detection Model Compression by Progressive Block Drop
Xiaoyong Chen
Yong Guo
Jiaming Liang
Sitong Zhuang
Runhao Zeng
Xiping Hu
60
0
0
21 Mar 2025
1
2
3
4
...
67
68
69
Next