2101.09671
Pruning and Quantization for Deep Neural Network Acceleration: A Survey
24 January 2021
Tailin Liang
C. Glossner
Lei Wang
Shaobo Shi
Xiaotong Zhang
MQ
Papers citing
"Pruning and Quantization for Deep Neural Network Acceleration: A Survey"
50 / 202 papers shown
Title
Optimizing LLMs for Resource-Constrained Environments: A Survey of Model Compression Techniques
Sanjay Surendranath Girija
Shashank Kapoor
Lakshit Arora
Dipen Pradhan
Aman Raj
Ankit Shetgaonkar
57
0
0
05 May 2025
A Brief Review for Compression and Transfer Learning Techniques in DeepFake Detection
Andreas Karathanasis
John Violos
I. Kompatsiaris
Symeon Papadopoulos
32
0
0
29 Apr 2025
Adaptively Pruned Spiking Neural Networks for Energy-Efficient Intracortical Neural Decoding
Francesca Rivelli
Martin Popov
Charalampos Kouzinopoulos
Guangzhi Tang
29
0
0
15 Apr 2025
Tin-Tin: Towards Tiny Learning on Tiny Devices with Integer-based Neural Network Training
Yi Hu
Jinhang Zuo
Eddie Zhang
Bob Iannucci
Carlee Joe-Wong
37
0
0
13 Apr 2025
The Effects of Grouped Structural Global Pruning of Vision Transformers on Domain Generalisation
Hamza Riaz
Alan F. Smeaton
ViT
30
0
0
05 Apr 2025
LLMPi: Optimizing LLMs for High-Throughput on Raspberry Pi
Mahsa Ardakani
Jinendra Malekar
Ramtin Zand
MQ
42
0
0
02 Apr 2025
CABS: Conflict-Aware and Balanced Sparsification for Enhancing Model Merging
Zongzhen Yang
Binhang Qi
Hailong Sun
Wenrui Long
Ruobing Zhao
Xiang Gao
MoMe
48
0
0
26 Feb 2025
Spectral Theory for Edge Pruning in Asynchronous Recurrent Graph Neural Networks
Nicolas Bessone
45
0
0
23 Feb 2025
Pruning as a Defense: Reducing Memorization in Large Language Models
Mansi Gupta
Nikhar Waghela
Sarthak Gupta
Shourya Goel
Sanjif Shanmugavelu
AAML
49
0
0
18 Feb 2025
HyperCLIP: Adapting Vision-Language models with Hypernetworks
Victor Akinwande
Mohammad Sadegh Norouzzadeh
Devin Willmott
Anna Bair
Madan Ravi Ganesh
J. Zico Kolter
CLIP
VLM
93
0
0
21 Dec 2024
A Comparative Study of Pruning Methods in Transformer-based Time Series Forecasting
Nicholas Kiefer
Arvid Weyrauch
Muhammed Öz
Achim Streit
Markus Gotz
Charlotte Debus
AI4TS
72
0
0
17 Dec 2024
Edge AI-based Radio Frequency Fingerprinting for IoT Networks
Ahmed Mohamed Hussain
Nada Abughanam
P. Papadimitratos
82
1
0
13 Dec 2024
Quantization without Tears
Minghao Fu
Hao Yu
Jie Shao
Junjie Zhou
Ke Zhu
Jianxin Wu
MQ
64
1
0
21 Nov 2024
An Edge Computing-Based Solution for Real-Time Leaf Disease Classification using Thermal Imaging
Públio Elon Correa da Silva
Jurandy Almeida
27
1
0
06 Nov 2024
Transferable polychromatic optical encoder for neural networks
Minho Choi
Jinlin Xiang
A. Wirth-Singh
Seung-Hwan Baek
Eli Shlizerman
A. Majumdar
36
1
0
05 Nov 2024
Accelerated AI Inference via Dynamic Execution Methods
Haim Barad
Jascha Achterberg
Tien Pei Chou
Jean Yu
31
0
0
30 Oct 2024
Efficient Reprogramming of Memristive Crossbars for DNNs: Weight Sorting and Bit Stucking
Matheus Farias
H. T. Kung
MQ
22
0
0
29 Oct 2024
Data Generation for Hardware-Friendly Post-Training Quantization
Lior Dikstein
Ariel Lapid
Arnon Netzer
H. Habi
MQ
154
0
0
29 Oct 2024
DSORT-MCU: Detecting Small Objects in Real-Time on Microcontroller Units
Liam Boyle
Julian Moosmann
Nicolas Baumann
Seonyeong Heo
Michele Magno
ObjD
45
2
0
22 Oct 2024
Modelling Concurrent RTP Flows for End-to-end Predictions of QoS in Real Time Communications
Tailai Song
Paolo Garza
Michela Meo
Maurizio Matteo Munafò
31
1
0
21 Oct 2024
Gradient-Free Neural Network Training on the Edge
Dotan Di Castro
O. Joglekar
Shir Kozlovsky
Vladimir Tchuiev
Michal Moshkovitz
MQ
14
0
0
13 Oct 2024
ReTok: Replacing Tokenizer to Enhance Representation Efficiency in Large Language Model
Shuhao Gu
Mengdi Zhao
Bowen Zhang
Liangdong Wang
Jijie Li
Guang Liu
25
2
0
06 Oct 2024
TrustEMG-Net: Using Representation-Masking Transformer with U-Net for Surface Electromyography Enhancement
Kuan-Chen Wang
Kai-Chun Liu
Ping-Cheng Yeh
Sheng-Yu Peng
Yu Tsao
28
1
0
04 Oct 2024
MicroFlow: An Efficient Rust-Based Inference Engine for TinyML
Matteo Carnelos
Francesco Pasti
Nicola Bellotto
23
1
0
28 Sep 2024
FAST GDRNPP: Improving the Speed of State-of-the-Art 6D Object Pose Estimation
Thomas Pöllabauer
Ashwin Pramod
Volker Knauthe
Michael Wahl
21
1
0
18 Sep 2024
Towards certifiable AI in aviation: landscape, challenges, and opportunities
Hymalai Bello
Daniel Geißler
L. Ray
Stefan Muller-Divéky
Peter Muller
Shannon Kittrell
Mengxi Liu
Bo Zhou
Paul Lukowicz
27
1
0
13 Sep 2024
HAPM -- Hardware Aware Pruning Method for CNN hardware accelerators in resource constrained devices
Federico Nicolás Peccia
Luciano Ferreyro
Alejandro Furfaro
17
0
0
26 Aug 2024
Enhancing One-shot Pruned Pre-trained Language Models through Sparse-Dense-Sparse Mechanism
Guanchen Li
Xiandong Zhao
Lian Liu
Zeping Li
Dong Li
Lu Tian
Jie He
Ashish Sirasao
E. Barsoum
VLM
32
0
0
20 Aug 2024
Convexity-based Pruning of Speech Representation Models
Teresa Dorszewski
Lenka Tětková
Lars Kai Hansen
25
2
0
16 Aug 2024
Designing Extremely Memory-Efficient CNNs for On-device Vision Tasks
Jaewook Lee
Yoel Park
Seulki Lee
VLM
25
1
0
07 Aug 2024
LLM as Runtime Error Handler: A Promising Pathway to Adaptive Self-Healing of Software Systems
Zhensu Sun
Haotian Zhu
Bowen Xu
Xiaoning Du
Yizhe Zhu
David Lo
27
3
0
02 Aug 2024
Reclaiming Residual Knowledge: A Novel Paradigm to Low-Bit Quantization
Róisín Luo
Alexandru Drimbarean
Walsh Simon
Colm O'Riordan
MQ
37
0
0
01 Aug 2024
Toward Efficient Convolutional Neural Networks With Structured Ternary Patterns
Christos Kyrkou
36
0
0
20 Jul 2024
Automated and Holistic Co-design of Neural Networks and ASICs for Enabling In-Pixel Intelligence
Shubha R. Kharel
Prashansa Mukim
Piotr Maj
Grzegorz W. Deptuch
Shinjae Yoo
Yihui Ren
Soumyajit Mandal
38
0
0
18 Jul 2024
Enhancing Split Computing and Early Exit Applications through Predefined Sparsity
Luigi Capogrosso
Enrico Fraccaroli
Giulio Petrozziello
Francesco Setti
Samarjit Chakraborty
Franco Fummi
Marco Cristani
30
3
0
16 Jul 2024
MTL-Split: Multi-Task Learning for Edge Devices using Split Computing
Luigi Capogrosso
Enrico Fraccaroli
Samarjit Chakraborty
Franco Fummi
Marco Cristani
MoE
38
5
0
08 Jul 2024
Quantizing YOLOv7: A Comprehensive Study
Mohammadamin Baghbanbashi
Mohsen Raji
B. Ghavami
MQ
29
8
0
06 Jul 2024
The Impact of Quantization and Pruning on Deep Reinforcement Learning Models
Heng Lu
Mehdi Alemi
Reza Rawassizadeh
36
1
0
05 Jul 2024
AnySR: Realizing Image Super-Resolution as Any-Scale, Any-Resource
Wengyi Zhan
Mingbao Lin
Chia-Wen Lin
Rongrong Ji
52
2
0
05 Jul 2024
Efficient DNN-Powered Software with Fair Sparse Models
Xuanqi Gao
Weipeng Jiang
Juan Zhai
Shiqing Ma
Xiaoyu Zhang
Chao Shen
50
0
0
03 Jul 2024
From Efficient Multimodal Models to World Models: A Survey
Xinji Mai
Zeng Tao
Junxiong Lin
Haoran Wang
Yang Chang
Yanlan Kang
Yan Wang
Wenqiang Zhang
32
5
0
27 Jun 2024
On Reducing Activity with Distillation and Regularization for Energy Efficient Spiking Neural Networks
Thomas Louis
Benoit Miramond
Alain Pegatoquet
Adrien Girard
30
0
0
26 Jun 2024
EON-1: A Brain-Inspired Processor for Near-Sensor Extreme Edge Online Feature Extraction
Alexandra Dobrita
Amirreza Yousefzadeh
Simon Thorpe
K. Vadivel
Paul Detterer
...
Gert-Jan van Schaik
Mario Konijnenburg
A. Gebregiorgis
Said Hamdioui
Manolis Sifalakis
43
0
0
25 Jun 2024
AI in Space for Scientific Missions: Strategies for Minimizing Neural-Network Model Upload
Jonah Ekelund
Ricardo Vinuesa
Yuri Khotyaintsev
Pierre Henri
G. Delzanno
Stefano Markidis
30
0
0
20 Jun 2024
DKDL-Net: A Lightweight Bearing Fault Detection Model via Decoupled Knowledge Distillation and Low-Rank Adaptation Fine-tuning
Ovanes Petrosian
Li Pengyi
He Yulong
Liu Jiarui
Sun Zhaoruikun
Fu Guofeng
Meng Liping
21
1
0
10 Jun 2024
Designs for Enabling Collaboration in Human-Machine Teaming via Interactive and Explainable Systems
Rohan R. Paleja
Michael Munje
K. Chang
Reed Jensen
Matthew C. Gombolay
39
2
0
07 Jun 2024
BMRS: Bayesian Model Reduction for Structured Pruning
Dustin Wright
Christian Igel
Raghavendra Selvan
BDL
MQ
44
0
0
03 Jun 2024
Investigating Calibration and Corruption Robustness of Post-hoc Pruned Perception CNNs: An Image Classification Benchmark Study
Pallavi Mitra
Gesina Schwalbe
Nadja Klein
AAML
36
1
0
31 May 2024
BDC-Occ: Binarized Deep Convolution Unit For Binarized Occupancy Network
Zongkai Zhang
Zidong Xu
Wenming Yang
Qingmin Liao
Jing-Hao Xue
MQ
3DV
46
1
0
27 May 2024
Robust width: A lightweight and certifiable adversarial defense
Jonathan Peck
Bart Goossens
AAML
37
1
0
24 May 2024