Pruning and Quantization for Deep Neural Network Acceleration: A Survey

24 January 2021

Papers citing "Pruning and Quantization for Deep Neural Network Acceleration: A Survey"

50 / 202 papers shown

Optimizing LLMs for Resource-Constrained Environments: A Survey of Model Compression Techniques Sanjay Surendranath Girija Shashank Kapoor Lakshit Arora Dipen Pradhan Aman Raj Ankit Shetgaonkar 57 0 0 05 May 2025
A Brief Review for Compression and Transfer Learning Techniques in DeepFake Detection Andreas Karathanasis John Violos I. Kompatsiaris Symeon Papadopoulos 32 0 0 29 Apr 2025
Adaptively Pruned Spiking Neural Networks for Energy-Efficient Intracortical Neural Decoding Francesca Rivelli Martin Popov Charalampos Kouzinopoulos Guangzhi Tang 29 0 0 15 Apr 2025
Tin-Tin: Towards Tiny Learning on Tiny Devices with Integer-based Neural Network Training Yi Hu Jinhang Zuo Eddie Zhang Bob Iannucci Carlee Joe-Wong 37 0 0 13 Apr 2025
The Effects of Grouped Structural Global Pruning of Vision Transformers on Domain Generalisation Hamza Riaz Alan F. Smeaton ViT 30 0 0 05 Apr 2025
LLMPi: Optimizing LLMs for High-Throughput on Raspberry Pi Mahsa Ardakani Jinendra Malekar Ramtin Zand MQ 42 0 0 02 Apr 2025
CABS: Conflict-Aware and Balanced Sparsification for Enhancing Model Merging Zongzhen Yang Binhang Qi Hailong Sun Wenrui Long Ruobing Zhao Xiang Gao MoMe 48 0 0 26 Feb 2025
Spectral Theory for Edge Pruning in Asynchronous Recurrent Graph Neural Networks Nicolas Bessone 45 0 0 23 Feb 2025
Pruning as a Defense: Reducing Memorization in Large Language Models Mansi Gupta Nikhar Waghela Sarthak Gupta Shourya Goel Sanjif Shanmugavelu AAML 49 0 0 18 Feb 2025
HyperCLIP: Adapting Vision-Language models with Hypernetworks Victor Akinwande Mohammad Sadegh Norouzzadeh Devin Willmott Anna Bair Madan Ravi Ganesh J. Zico Kolter CLIP VLM 93 0 0 21 Dec 2024
A Comparative Study of Pruning Methods in Transformer-based Time Series Forecasting Nicholas Kiefer Arvid Weyrauch Muhammed Öz Achim Streit Markus Gotz Charlotte Debus AI4TS 72 0 0 17 Dec 2024
Edge AI-based Radio Frequency Fingerprinting for IoT Networks Ahmed Mohamed Hussain Nada Abughanam P. Papadimitratos 82 1 0 13 Dec 2024
Quantization without Tears Minghao Fu Hao Yu Jie Shao Junjie Zhou Ke Zhu Jianxin Wu MQ 64 1 0 21 Nov 2024
An Edge Computing-Based Solution for Real-Time Leaf Disease Classification using Thermal Imaging Públio Elon Correa da Silva Jurandy Almeida 27 1 0 06 Nov 2024
Transferable polychromatic optical encoder for neural networks Minho Choi Jinlin Xiang A. Wirth-Singh Seung-Hwan Baek Eli Shlizerman A. Majumdar 36 1 0 05 Nov 2024
Accelerated AI Inference via Dynamic Execution Methods Haim Barad Jascha Achterberg Tien Pei Chou Jean Yu 31 0 0 30 Oct 2024
Efficient Reprogramming of Memristive Crossbars for DNNs: Weight Sorting and Bit Stucking Matheus Farias H. T. Kung MQ 22 0 0 29 Oct 2024
Data Generation for Hardware-Friendly Post-Training Quantization Lior Dikstein Ariel Lapid Arnon Netzer H. Habi MQ 154 0 0 29 Oct 2024
DSORT-MCU: Detecting Small Objects in Real-Time on Microcontroller Units Liam Boyle Julian Moosmann Nicolas Baumann Seonyeong Heo Michele Magno ObjD 45 2 0 22 Oct 2024
Modelling Concurrent RTP Flows for End-to-end Predictions of QoS in Real Time Communications Tailai Song Paolo Garza Michela Meo Maurizio Matteo Munafò 31 1 0 21 Oct 2024
Gradient-Free Neural Network Training on the Edge Dotan Di Castro O. Joglekar Shir Kozlovsky Vladimir Tchuiev Michal Moshkovitz MQ 14 0 0 13 Oct 2024
ReTok: Replacing Tokenizer to Enhance Representation Efficiency in Large Language Model Shuhao Gu Mengdi Zhao Bowen Zhang Liangdong Wang Jijie Li Guang Liu 25 2 0 06 Oct 2024
TrustEMG-Net: Using Representation-Masking Transformer with U-Net for Surface Electromyography Enhancement Kuan-Chen Wang Kai-Chun Liu Ping-Cheng Yeh Sheng-Yu Peng Yu Tsao 28 1 0 04 Oct 2024
MicroFlow: An Efficient Rust-Based Inference Engine for TinyML Matteo Carnelos Francesco Pasti Nicola Bellotto 23 1 0 28 Sep 2024
FAST GDRNPP: Improving the Speed of State-of-the-Art 6D Object Pose Estimation Thomas Pöllabauer Ashwin Pramod Volker Knauthe Michael Wahl 21 1 0 18 Sep 2024
Towards certifiable AI in aviation: landscape, challenges, and opportunities Hymalai Bello Daniel Geißler L. Ray Stefan Muller-Divéky Peter Muller Shannon Kittrell Mengxi Liu Bo Zhou Paul Lukowicz 27 1 0 13 Sep 2024
HAPM -- Hardware Aware Pruning Method for CNN hardware accelerators in resource constrained devices Federico Nicolás Peccia Luciano Ferreyro Alejandro Furfaro 17 0 0 26 Aug 2024
Enhancing One-shot Pruned Pre-trained Language Models through Sparse-Dense-Sparse Mechanism Guanchen Li Xiandong Zhao Lian Liu Zeping Li Dong Li Lu Tian Jie He Ashish Sirasao E. Barsoum VLM 32 0 0 20 Aug 2024
Convexity-based Pruning of Speech Representation Models Teresa Dorszewski Lenka Tětková Lars Kai Hansen 25 2 0 16 Aug 2024
Designing Extremely Memory-Efficient CNNs for On-device Vision Tasks Jaewook Lee Yoel Park Seulki Lee VLM 25 1 0 07 Aug 2024
LLM as Runtime Error Handler: A Promising Pathway to Adaptive Self-Healing of Software Systems Zhensu Sun Haotian Zhu Bowen Xu Xiaoning Du Yizhe Zhu David Lo 27 3 0 02 Aug 2024
Reclaiming Residual Knowledge: A Novel Paradigm to Low-Bit Quantization Róisín Luo Alexandru Drimbarean Walsh Simon Colm O'Riordan MQ 37 0 0 01 Aug 2024
Toward Efficient Convolutional Neural Networks With Structured Ternary Patterns Christos Kyrkou 36 0 0 20 Jul 2024
Automated and Holistic Co-design of Neural Networks and ASICs for Enabling In-Pixel Intelligence Shubha R. Kharel Prashansa Mukim Piotr Maj Grzegorz W. Deptuch Shinjae Yoo Yihui Ren Soumyajit Mandal 38 0 0 18 Jul 2024
Enhancing Split Computing and Early Exit Applications through Predefined Sparsity Luigi Capogrosso Enrico Fraccaroli Giulio Petrozziello Francesco Setti Samarjit Chakraborty Franco Fummi Marco Cristani 30 3 0 16 Jul 2024
MTL-Split: Multi-Task Learning for Edge Devices using Split Computing Luigi Capogrosso Enrico Fraccaroli Samarjit Chakraborty Franco Fummi Marco Cristani MoE 38 5 0 08 Jul 2024
Quantizing YOLOv7: A Comprehensive Study Mohammadamin Baghbanbashi Mohsen Raji B. Ghavami MQ 29 8 0 06 Jul 2024
The Impact of Quantization and Pruning on Deep Reinforcement Learning Models Heng Lu Mehdi Alemi Reza Rawassizadeh 36 1 0 05 Jul 2024
AnySR: Realizing Image Super-Resolution as Any-Scale, Any-Resource Wengyi Zhan Mingbao Lin Chia-Wen Lin Rongrong Ji 52 2 0 05 Jul 2024
Efficient DNN-Powered Software with Fair Sparse Models Xuanqi Gao Weipeng Jiang Juan Zhai Shiqing Ma Xiaoyu Zhang Chao Shen 50 0 0 03 Jul 2024
From Efficient Multimodal Models to World Models: A Survey Xinji Mai Zeng Tao Junxiong Lin Haoran Wang Yang Chang Yanlan Kang Yan Wang Wenqiang Zhang 32 5 0 27 Jun 2024
On Reducing Activity with Distillation and Regularization for Energy Efficient Spiking Neural Networks Thomas Louis Benoit Miramond Alain Pegatoquet Adrien Girard 30 0 0 26 Jun 2024
EON-1: A Brain-Inspired Processor for Near-Sensor Extreme Edge Online Feature Extraction Alexandra Dobrita Amirreza Yousefzadeh Simon Thorpe K. Vadivel Paul Detterer ... Gert-Jan van Schaik Mario Konijnenburg A. Gebregiorgis Said Hamdioui Manolis Sifalakis 43 0 0 25 Jun 2024
AI in Space for Scientific Missions: Strategies for Minimizing Neural-Network Model Upload Jonah Ekelund Ricardo Vinuesa Yuri Khotyaintsev Pierre Henri G. Delzanno Stefano Markidis 30 0 0 20 Jun 2024
DKDL-Net: A Lightweight Bearing Fault Detection Model via Decoupled Knowledge Distillation and Low-Rank Adaptation Fine-tuning Ovanes Petrosian Li Pengyi He Yulong Liu Jiarui Sun Zhaoruikun Fu Guofeng Meng Liping 21 1 0 10 Jun 2024
Designs for Enabling Collaboration in Human-Machine Teaming via Interactive and Explainable Systems Rohan R. Paleja Michael Munje K. Chang Reed Jensen Matthew C. Gombolay 39 2 0 07 Jun 2024
BMRS: Bayesian Model Reduction for Structured Pruning Dustin Wright Christian Igel Raghavendra Selvan BDL MQ 44 0 0 03 Jun 2024
Investigating Calibration and Corruption Robustness of Post-hoc Pruned Perception CNNs: An Image Classification Benchmark Study Pallavi Mitra Gesina Schwalbe Nadja Klein AAML 36 1 0 31 May 2024
BDC-Occ: Binarized Deep Convolution Unit For Binarized Occupancy Network Zongkai Zhang Zidong Xu Wenming Yang Qingmin Liao Jing-Hao Xue MQ 3DV 46 1 0 27 May 2024
Robust width: A lightweight and certifiable adversarial defense Jonathan Peck Bart Goossens AAML 37 1 0 24 May 2024