Pruning and Quantization for Deep Neural Network Acceleration: A Survey
Tailin Liang, C. Glossner, Lei Wang, Shaobo Shi, Xiaotong Zhang · MQ · 24 January 2021 · arXiv: 2101.09671

Papers citing "Pruning and Quantization for Deep Neural Network Acceleration: A Survey"
50 / 202 papers shown
Survey on Evolutionary Deep Learning: Principles, Algorithms, Applications and Open Issues
Nan Li, Lianbo Ma, Guo-Ding Yu, Bing Xue, Mengjie Zhang, Yaochu Jin · 23 Aug 2022 · 27 / 70 / 0

Efficient High-Resolution Deep Learning: A Survey
Arian Bakhtiarnia, Qi Zhang, Alexandros Iosifidis · MedIm · 26 Jul 2022 · 21 / 19 / 0

Bitwidth-Adaptive Quantization-Aware Neural Network Training: A Meta-Learning Approach
Jiseok Youn, Jaehun Song, Hyung-Sin Kim, S. Bahk · MQ · 20 Jul 2022 · 9 / 8 / 0

DASS: Differentiable Architecture Search for Sparse neural networks
H. Mousavi, Mohammad Loni, Mina Alibeigi, Masoud Daneshtalab · 14 Jul 2022 · 38 / 9 / 0

CPrune: Compiler-Informed Model Pruning for Efficient Target-Aware DNN Execution
Taeho Kim, Yongin Kwon, Jemin Lee, Taeho Kim, Sangtae Ha · 04 Jul 2022 · 27 / 2 / 0
Automatic autism spectrum disorder detection using artificial intelligence methods with MRI neuroimaging: A review
Parisa Moridian, Navid Ghassemi, M. Jafari, S. Salloum-Asfar, Delaram Sadeghi, ..., A. Subasi, R. Alizadehsani, Juan M Gorriz, Sara A. Abdulla, U. Acharya · 20 Jun 2022 · 8 / 74 / 0

Why Quantization Improves Generalization: NTK of Binary Weight Neural Networks
Kaiqi Zhang, Ming Yin, Yu-Xiang Wang · MQ · 13 Jun 2022 · 24 / 4 / 0

Differentially Private Model Compression
Fatemehsadat Mireshghallah, A. Backurs, Huseyin A. Inan, Lukas Wutschitz, Janardhan Kulkarni · SyDa · 03 Jun 2022 · 21 / 13 / 0

Deep Learning Workload Scheduling in GPU Datacenters: Taxonomy, Challenges and Vision
Wei Gao, Qi Hu, Zhisheng Ye, Peng Sun, Xiaolin Wang, Yingwei Luo, Tianwei Zhang, Yonggang Wen · 24 May 2022 · 83 / 26 / 0

Dynamic Split Computing for Efficient Deep Edge Intelligence
Arian Bakhtiarnia, Nemanja Milošević, Qi Zhang, Dragana Bajović, Alexandros Iosifidis · 23 May 2022 · 25 / 24 / 0
What Do Compressed Multilingual Machine Translation Models Forget?
Alireza Mohammadshahi, Vassilina Nikoulina, Alexandre Berard, Caroline Brun, James Henderson, Laurent Besacier · AI4CE · 22 May 2022 · 42 / 9 / 0

Perturbation of Deep Autoencoder Weights for Model Compression and Classification of Tabular Data
Manar D. Samad, Sakib Abrar · 17 May 2022 · 22 / 12 / 0

Automation Slicing and Testing for in-App Deep Learning Models
Hao Wu, Yuhang Gong, Xiaopeng Ke, Hanzhong Liang, Minghao Li, Fengyuan Xu, Yunxin Liu, Sheng Zhong · 15 May 2022 · 49 / 1 / 0

A Comprehensive Survey on Model Quantization for Deep Neural Networks in Image Classification
Babak Rokh, A. Azarpeyvand, Alireza Khanteymoori · MQ · 14 May 2022 · 30 / 82 / 0

A Survey on AI Sustainability: Emerging Trends on Learning Algorithms and Research Challenges
Zhenghua Chen, Min-man Wu, Alvin Chan, Xiaoli Li, Yew-Soon Ong · 08 May 2022 · 19 / 6 / 0

Resource-efficient domain adaptive pre-training for medical images
Y. Mehmood, U. I. Bajwa, Xianfang Sun · 28 Apr 2022 · 14 / 1 / 0
Boosting Pruned Networks with Linear Over-parameterization
Yundi Qian, Siyuan Pan, Xiaoshuang Li, Jie Zhang, Liang Hou, Xiaobing Tu · 25 Apr 2022 · 11 / 2 / 0

MIME: Adapting a Single Neural Network for Multi-task Inference with Memory-efficient Dynamic Pruning
Abhiroop Bhattacharjee, Yeshwanth Venkatesha, Abhishek Moitra, Priyadarshini Panda · 11 Apr 2022 · 19 / 6 / 0

Deep neural network goes lighter: A case study of deep compression techniques on automatic RF modulation recognition for Beyond 5G networks
Anu Jagannath, Jithin Jagannath, Yanzhi Wang, Tommaso Melodia · 09 Apr 2022 · 23 / 3 / 0

Enabling All In-Edge Deep Learning: A Literature Review
Praveen Joshi, Mohammed Hasanuzzaman, Chandra Thapa, Haithem Afli, T. Scully · 07 Apr 2022 · 31 / 22 / 0

Optimizing the Consumption of Spiking Neural Networks with Activity Regularization
Simon Narduzzi, Siavash Bigdeli, Shih-Chii Liu, L. A. Dunbar · 04 Apr 2022 · 18 / 8 / 0

Adaptive and Cascaded Compressive Sensing
Chenxi Qiu, Tao Yue, Xue-mei Hu · 21 Mar 2022 · 25 / 2 / 0
Hardware Approximate Techniques for Deep Neural Network Accelerators: A Survey
Giorgos Armeniakos, Georgios Zervakis, Dimitrios Soudris, J. Henkel · 16 Mar 2022 · 211 / 93 / 0

Improvements to Gradient Descent Methods for Quantum Tensor Network Machine Learning
F. Barratt, J. Dborin, L. Wright · 03 Mar 2022 · 16 / 11 / 0

F8Net: Fixed-Point 8-bit Only Multiplication for Network Quantization
Qing Jin, Jian Ren, Richard Zhuang, Sumant Hanumante, Zhengang Li, Zhiyu Chen, Yanzhi Wang, Kai-Min Yang, Sergey Tulyakov · MQ · 10 Feb 2022 · 24 / 48 / 0

Quantization in Layer's Input is Matter
Daning Cheng, Wenguang Chen · MQ · 10 Feb 2022 · 11 / 0 / 0

Local Feature Matching with Transformers for low-end devices
Kyrylo Kolodiazhnyi · 01 Feb 2022 · 16 / 0 / 0

COIN++: Neural Compression Across Modalities
Emilien Dupont, H. Loya, Milad Alizadeh, Adam Goliński, Yee Whye Teh, Arnaud Doucet · 30 Jan 2022 · 53 / 82 / 0
Object Detection in Autonomous Vehicles: Status and Open Challenges
Abhishek Balasubramaniam, S. Pasricha · 19 Jan 2022 · 47 / 54 / 0

The Effect of Model Compression on Fairness in Facial Expression Recognition
Samuil Stoychev, Hatice Gunes · CVBM · 05 Jan 2022 · 30 / 19 / 0

Accurate Neural Training with 4-bit Matrix Multiplications at Standard Formats
Brian Chmiel, Ron Banner, Elad Hoffer, Hilla Ben Yaacov, Daniel Soudry · MQ · 19 Dec 2021 · 27 / 22 / 0

A Survey on Green Deep Learning
Jingjing Xu, Wangchunshu Zhou, Zhiyi Fu, Hao Zhou, Lei Li · VLM · 08 Nov 2021 · 73 / 83 / 0

When in Doubt, Summon the Titans: Efficient Inference with Large Models
A. S. Rawat, Manzil Zaheer, A. Menon, Amr Ahmed, Sanjiv Kumar · 19 Oct 2021 · 19 / 7 / 0

BERMo: What can BERT learn from ELMo?
Sangamesh Kodge, Kaushik Roy · 18 Oct 2021 · 38 / 3 / 0
Training Deep Neural Networks with Joint Quantization and Pruning of Weights and Activations
Xinyu Zhang, Ian Colbert, Ken Kreutz-Delgado, Srinjoy Das · MQ · 15 Oct 2021 · 32 / 11 / 0

Shifting Capsule Networks from the Cloud to the Deep Edge
Miguel Costa, Diogo Costa, T. Gomes, Sandro Pinto · 06 Oct 2021 · 21 / 5 / 0

On the Compression of Neural Networks Using ℓ0-Norm Regularization and Weight Pruning
F. Oliveira, E. Batista, R. Seara · 10 Sep 2021 · 12 / 9 / 0

Juvenile state hypothesis: What we can learn from lottery ticket hypothesis researches?
Di Zhang · 08 Sep 2021 · 23 / 1 / 0

How much pre-training is enough to discover a good subnetwork?
Cameron R. Wolfe, Fangshuo Liao, Qihan Wang, J. Kim, Anastasios Kyrillidis · 31 Jul 2021 · 30 / 3 / 0
Experiments on Properties of Hidden Structures of Sparse Neural Networks
Julian Stier, Harsh Darji, Michael Granitzer · 27 Jul 2021 · 16 / 2 / 0

A Lightweight and Gradient-Stable Neural Layer
Yueyao Yu, Yin Zhang · 08 Jun 2021 · 29 / 0 / 0

Model Compression
Arhum Ishtiaq, Sara Mahmood, M. Anees, Neha Mumtaz · 20 May 2021 · 15 / 0 / 0

The Untapped Potential of Off-the-Shelf Convolutional Neural Networks
Matthew J. Inkawhich, Nathan Inkawhich, Eric K. Davis, H. Li, Yiran Chen · BDL · 17 Mar 2021 · 18 / 0 / 0

Deep Model Compression based on the Training History
S. H. Shabbeer Basha, M. Farazuddin, Viswanath Pulabaigari, S. Dubey, Snehasis Mukherjee · VLM · 30 Jan 2021 · 16 / 17 / 0

DAIS: Automatic Channel Pruning via Differentiable Annealing Indicator Search
Yushuo Guan, Ning Liu, Pengyu Zhao, Zhengping Che, Kaigui Bian, Yanzhi Wang, Jian Tang · 04 Nov 2020 · 20 / 38 / 0
Group Sparsity: The Hinge Between Filter Pruning and Decomposition for Network Compression
Yawei Li, Shuhang Gu, Christoph Mayer, Luc Van Gool, Radu Timofte · 19 Mar 2020 · 137 / 189 / 0

What is the State of Neural Network Pruning?
Davis W. Blalock, Jose Javier Gonzalez Ortiz, Jonathan Frankle, John Guttag · 06 Mar 2020 · 191 / 1,027 / 0

Forward and Backward Information Retention for Accurate Binary Neural Networks
Haotong Qin, Ruihao Gong, Xianglong Liu, Mingzhu Shen, Ziran Wei, F. Yu, Jingkuan Song · MQ · 24 Sep 2019 · 131 / 324 / 0

Training High-Performance and Large-Scale Deep Neural Networks with Full 8-bit Integers
Yukuan Yang, Shuang Wu, Lei Deng, Tianyi Yan, Yuan Xie, Guoqi Li · MQ · 05 Sep 2019 · 99 / 110 / 0

MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications
Andrew G. Howard, Menglong Zhu, Bo Chen, Dmitry Kalenichenko, Weijun Wang, Tobias Weyand, M. Andreetto, Hartwig Adam · 3DH · 17 Apr 2017 · 950 / 20,567 / 0