Pruning and Quantization for Deep Neural Network Acceleration: A Survey

24 January 2021
Tailin Liang
C. Glossner
Lei Wang
Shaobo Shi
Xiaotong Zhang
    MQ

Papers citing "Pruning and Quantization for Deep Neural Network Acceleration: A Survey"

50 / 202 papers shown
Optimizing LLMs for Resource-Constrained Environments: A Survey of Model Compression Techniques
Sanjay Surendranath Girija
Shashank Kapoor
Lakshit Arora
Dipen Pradhan
Aman Raj
Ankit Shetgaonkar
57
0
0
05 May 2025
A Brief Review for Compression and Transfer Learning Techniques in DeepFake Detection
Andreas Karathanasis
John Violos
I. Kompatsiaris
Symeon Papadopoulos
32
0
0
29 Apr 2025
Adaptively Pruned Spiking Neural Networks for Energy-Efficient Intracortical Neural Decoding
Francesca Rivelli
Martin Popov
Charalampos Kouzinopoulos
Guangzhi Tang
29
0
0
15 Apr 2025
Tin-Tin: Towards Tiny Learning on Tiny Devices with Integer-based Neural Network Training
Yi Hu
Jinhang Zuo
Eddie Zhang
Bob Iannucci
Carlee Joe-Wong
37
0
0
13 Apr 2025
The Effects of Grouped Structural Global Pruning of Vision Transformers on Domain Generalisation
Hamza Riaz
Alan F. Smeaton
ViT
30
0
0
05 Apr 2025
LLMPi: Optimizing LLMs for High-Throughput on Raspberry Pi
Mahsa Ardakani
Jinendra Malekar
Ramtin Zand
MQ
42
0
0
02 Apr 2025
CABS: Conflict-Aware and Balanced Sparsification for Enhancing Model Merging
Zongzhen Yang
Binhang Qi
Hailong Sun
Wenrui Long
Ruobing Zhao
Xiang Gao
MoMe
48
0
0
26 Feb 2025
Spectral Theory for Edge Pruning in Asynchronous Recurrent Graph Neural Networks
Nicolas Bessone
45
0
0
23 Feb 2025
Pruning as a Defense: Reducing Memorization in Large Language Models
Mansi Gupta
Nikhar Waghela
Sarthak Gupta
Shourya Goel
Sanjif Shanmugavelu
AAML
49
0
0
18 Feb 2025
HyperCLIP: Adapting Vision-Language models with Hypernetworks
Victor Akinwande
Mohammad Sadegh Norouzzadeh
Devin Willmott
Anna Bair
Madan Ravi Ganesh
J. Zico Kolter
CLIP
VLM
93
0
0
21 Dec 2024
A Comparative Study of Pruning Methods in Transformer-based Time Series Forecasting
Nicholas Kiefer
Arvid Weyrauch
Muhammed Öz
Achim Streit
Markus Gotz
Charlotte Debus
AI4TS
72
0
0
17 Dec 2024
Edge AI-based Radio Frequency Fingerprinting for IoT Networks
Ahmed Mohamed Hussain
Nada Abughanam
P. Papadimitratos
82
1
0
13 Dec 2024
Quantization without Tears
Minghao Fu
Hao Yu
Jie Shao
Junjie Zhou
Ke Zhu
Jianxin Wu
MQ
64
1
0
21 Nov 2024
An Edge Computing-Based Solution for Real-Time Leaf Disease Classification using Thermal Imaging
Públio Elon Correa da Silva
Jurandy Almeida
27
1
0
06 Nov 2024
Transferable polychromatic optical encoder for neural networks
Minho Choi
Jinlin Xiang
A. Wirth-Singh
Seung-Hwan Baek
Eli Shlizerman
A. Majumdar
36
1
0
05 Nov 2024
Accelerated AI Inference via Dynamic Execution Methods
Haim Barad
Jascha Achterberg
Tien Pei Chou
Jean Yu
31
0
0
30 Oct 2024
Efficient Reprogramming of Memristive Crossbars for DNNs: Weight Sorting and Bit Stucking
Matheus Farias
H. T. Kung
MQ
22
0
0
29 Oct 2024
Data Generation for Hardware-Friendly Post-Training Quantization
Lior Dikstein
Ariel Lapid
Arnon Netzer
H. Habi
MQ
154
0
0
29 Oct 2024
DSORT-MCU: Detecting Small Objects in Real-Time on Microcontroller Units
Liam Boyle
Julian Moosmann
Nicolas Baumann
Seonyeong Heo
Michele Magno
ObjD
45
2
0
22 Oct 2024
Modelling Concurrent RTP Flows for End-to-end Predictions of QoS in Real Time Communications
Tailai Song
Paolo Garza
Michela Meo
Maurizio Matteo Munafò
31
1
0
21 Oct 2024
Gradient-Free Neural Network Training on the Edge
Dotan Di Castro
O. Joglekar
Shir Kozlovsky
Vladimir Tchuiev
Michal Moshkovitz
MQ
14
0
0
13 Oct 2024
ReTok: Replacing Tokenizer to Enhance Representation Efficiency in Large Language Model
Shuhao Gu
Mengdi Zhao
Bowen Zhang
Liangdong Wang
Jijie Li
Guang Liu
25
2
0
06 Oct 2024
TrustEMG-Net: Using Representation-Masking Transformer with U-Net for Surface Electromyography Enhancement
Kuan-Chen Wang
Kai-Chun Liu
Ping-Cheng Yeh
Sheng-Yu Peng
Yu Tsao
28
1
0
04 Oct 2024
MicroFlow: An Efficient Rust-Based Inference Engine for TinyML
Matteo Carnelos
Francesco Pasti
Nicola Bellotto
23
1
0
28 Sep 2024
FAST GDRNPP: Improving the Speed of State-of-the-Art 6D Object Pose Estimation
Thomas Pöllabauer
Ashwin Pramod
Volker Knauthe
Michael Wahl
21
1
0
18 Sep 2024
Towards certifiable AI in aviation: landscape, challenges, and opportunities
Hymalai Bello
Daniel Geißler
L. Ray
Stefan Muller-Divéky
Peter Muller
Shannon Kittrell
Mengxi Liu
Bo Zhou
Paul Lukowicz
27
1
0
13 Sep 2024
HAPM -- Hardware Aware Pruning Method for CNN hardware accelerators in resource constrained devices
Federico Nicolás Peccia
Luciano Ferreyro
Alejandro Furfaro
17
0
0
26 Aug 2024
Enhancing One-shot Pruned Pre-trained Language Models through Sparse-Dense-Sparse Mechanism
Guanchen Li
Xiandong Zhao
Lian Liu
Zeping Li
Dong Li
Lu Tian
Jie He
Ashish Sirasao
E. Barsoum
VLM
32
0
0
20 Aug 2024
Convexity-based Pruning of Speech Representation Models
Teresa Dorszewski
Lenka Tětková
Lars Kai Hansen
25
2
0
16 Aug 2024
Designing Extremely Memory-Efficient CNNs for On-device Vision Tasks
Jaewook Lee
Yoel Park
Seulki Lee
VLM
25
1
0
07 Aug 2024
LLM as Runtime Error Handler: A Promising Pathway to Adaptive Self-Healing of Software Systems
Zhensu Sun
Haotian Zhu
Bowen Xu
Xiaoning Du
Yizhe Zhu
David Lo
27
3
0
02 Aug 2024
Reclaiming Residual Knowledge: A Novel Paradigm to Low-Bit Quantization
Róisín Luo
Alexandru Drimbarean
Walsh Simon
Colm O'Riordan
MQ
37
0
0
01 Aug 2024
Toward Efficient Convolutional Neural Networks With Structured Ternary Patterns
Christos Kyrkou
36
0
0
20 Jul 2024
Automated and Holistic Co-design of Neural Networks and ASICs for Enabling In-Pixel Intelligence
Shubha R. Kharel
Prashansa Mukim
Piotr Maj
Grzegorz W. Deptuch
Shinjae Yoo
Yihui Ren
Soumyajit Mandal
38
0
0
18 Jul 2024
Enhancing Split Computing and Early Exit Applications through Predefined Sparsity
Luigi Capogrosso
Enrico Fraccaroli
Giulio Petrozziello
Francesco Setti
Samarjit Chakraborty
Franco Fummi
Marco Cristani
30
3
0
16 Jul 2024
MTL-Split: Multi-Task Learning for Edge Devices using Split Computing
Luigi Capogrosso
Enrico Fraccaroli
Samarjit Chakraborty
Franco Fummi
Marco Cristani
MoE
38
5
0
08 Jul 2024
Quantizing YOLOv7: A Comprehensive Study
Mohammadamin Baghbanbashi
Mohsen Raji
B. Ghavami
MQ
29
8
0
06 Jul 2024
The Impact of Quantization and Pruning on Deep Reinforcement Learning Models
Heng Lu
Mehdi Alemi
Reza Rawassizadeh
36
1
0
05 Jul 2024
AnySR: Realizing Image Super-Resolution as Any-Scale, Any-Resource
Wengyi Zhan
Mingbao Lin
Chia-Wen Lin
Rongrong Ji
52
2
0
05 Jul 2024
Efficient DNN-Powered Software with Fair Sparse Models
Xuanqi Gao
Weipeng Jiang
Juan Zhai
Shiqing Ma
Xiaoyu Zhang
Chao Shen
50
0
0
03 Jul 2024
From Efficient Multimodal Models to World Models: A Survey
Xinji Mai
Zeng Tao
Junxiong Lin
Haoran Wang
Yang Chang
Yanlan Kang
Yan Wang
Wenqiang Zhang
32
5
0
27 Jun 2024
On Reducing Activity with Distillation and Regularization for Energy Efficient Spiking Neural Networks
Thomas Louis
Benoit Miramond
Alain Pegatoquet
Adrien Girard
30
0
0
26 Jun 2024
EON-1: A Brain-Inspired Processor for Near-Sensor Extreme Edge Online Feature Extraction
Alexandra Dobrita
Amirreza Yousefzadeh
Simon Thorpe
K. Vadivel
Paul Detterer
...
Gert-Jan van Schaik
Mario Konijnenburg
A. Gebregiorgis
Said Hamdioui
Manolis Sifalakis
43
0
0
25 Jun 2024
AI in Space for Scientific Missions: Strategies for Minimizing Neural-Network Model Upload
Jonah Ekelund
Ricardo Vinuesa
Yuri Khotyaintsev
Pierre Henri
G. Delzanno
Stefano Markidis
30
0
0
20 Jun 2024
DKDL-Net: A Lightweight Bearing Fault Detection Model via Decoupled Knowledge Distillation and Low-Rank Adaptation Fine-tuning
Ovanes Petrosian
Li Pengyi
He Yulong
Liu Jiarui
Sun Zhaoruikun
Fu Guofeng
Meng Liping
21
1
0
10 Jun 2024
Designs for Enabling Collaboration in Human-Machine Teaming via Interactive and Explainable Systems
Rohan R. Paleja
Michael Munje
K. Chang
Reed Jensen
Matthew C. Gombolay
39
2
0
07 Jun 2024
BMRS: Bayesian Model Reduction for Structured Pruning
Dustin Wright
Christian Igel
Raghavendra Selvan
BDL
MQ
44
0
0
03 Jun 2024
Investigating Calibration and Corruption Robustness of Post-hoc Pruned Perception CNNs: An Image Classification Benchmark Study
Pallavi Mitra
Gesina Schwalbe
Nadja Klein
AAML
36
1
0
31 May 2024
BDC-Occ: Binarized Deep Convolution Unit For Binarized Occupancy Network
Zongkai Zhang
Zidong Xu
Wenming Yang
Qingmin Liao
Jing-Hao Xue
MQ
3DV
46
1
0
27 May 2024
Robust width: A lightweight and certifiable adversarial defense
Jonathan Peck
Bart Goossens
AAML
37
1
0
24 May 2024