Trained Ternary Quantization

4 December 2016

Song Han

Papers citing "Trained Ternary Quantization"

50 / 508 papers shown

Title
Quantization of Fully Convolutional Networks for Accurate Biomedical Image Segmentation Xiaowei Xu Q. Lu Yu Hu Lin Yang X. S. Hu Benlin Liu Yiyu Shi MedIm 84 85 0 13 Mar 2018
Deep Neural Network Compression with Single and Multiple Level Quantization Yuhui Xu Yongzhuang Wang Aojun Zhou Weiyao Lin H. Xiong MQ 63 115 0 06 Mar 2018
An Optimal Control Approach to Deep Learning and Applications to Discrete-Weight Neural Networks Qianxiao Li Shuji Hao 96 76 0 04 Mar 2018
The History Began from AlexNet: A Comprehensive Survey on Deep Learning Approaches Md. Zahangir Alom T. Taha C. Yakopcic Stefan Westberg P. Sidike Mst Shamima Nasrin B. Van Essen A. Awwal V. Asari VLM 133 882 0 03 Mar 2018
WRPN & Apprentice: Methods for Training and Inference using Low-Precision Numerics Asit K. Mishra Debbie Marr 29 6 0 01 Mar 2018
PBGen: Partial Binarization of Deconvolution-Based Generators for Edge Intelligence Jinglan Liu Jiaxin Zhang Yukun Ding Xiaowei Xu Meng Jiang Yiyu Shi 78 4 0 26 Feb 2018
Loss-aware Weight Quantization of Deep Networks Lu Hou James T. Kwok MQ 104 127 0 23 Feb 2018
Model compression via distillation and quantization A. Polino Razvan Pascanu Dan Alistarh MQ 106 734 0 15 Feb 2018
Training and Inference with Integers in Deep Neural Networks Shuang Wu Guoqi Li F. Chen Luping Shi MQ 78 391 0 13 Feb 2018
On the Universal Approximability and Complexity Bounds of Quantized ReLU Neural Networks Yukun Ding Jinglan Liu Jinjun Xiong Yiyu Shi MQ 120 21 0 10 Feb 2018
AMC: AutoML for Model Compression and Acceleration on Mobile Devices Yihui He Ji Lin Zhijian Liu Hanrui Wang Li Li Song Han 123 1,350 0 10 Feb 2018
Effective Quantization Approaches for Recurrent Neural Networks Md. Zahangir Alom A. Moody N. Maruyama B. Van Essen T. Taha MQ 53 36 0 07 Feb 2018
Deep Versus Wide Convolutional Neural Networks for Object Recognition on Neuromorphic System Md. Zahangir Alom Theodora Josue Md Nayim Rahman Will Mitchell C. Yakopcic T. Taha 37 21 0 07 Feb 2018
Universal Deep Neural Network Compression Yoojin Choi Mostafa El-Khamy Jungwon Lee MQ 151 88 0 07 Feb 2018
Recent Advances in Efficient Computation of Deep Convolutional Neural Networks Jian Cheng Peisong Wang Gang Li Qinghao Hu Hanqing Lu 55 3 0 03 Feb 2018
Build a Compact Binary Neural Network through Bit-level Sensitivity and Data Pruning Yixing Li Fengbo Ren MQ 36 12 0 03 Feb 2018
Alternating Multi-bit Quantization for Recurrent Neural Networks Chen Xu Jianqiang Yao Zhouchen Lin Wenwu Ou Yuanbin Cao Zhirong Wang H. Zha MQ 92 116 0 01 Feb 2018
TernaryNet: Faster Deep Model Inference without GPUs for Medical 3D Segmentation using Sparse and Binary Convolutions M. Heinrich Maximilian Blendowski Ozan Oktay MedIm 66 41 0 29 Jan 2018
Stochastic Downsampling for Cost-Adjustable Inference and Improved Regularization in Convolutional Networks Jason Kuen Xiangfei Kong Zhe Lin G. Wang Jianxiong Yin Simon See Yap-Peng Tan BDL 76 25 0 29 Jan 2018
Piggyback: Adapting a Single Network to Multiple Tasks by Learning to Mask Weights Arun Mallya Dillon Davis Svetlana Lazebnik CLL 72 35 0 19 Jan 2018
BinaryRelax: A Relaxation Approach For Training Deep Neural Networks With Quantized Weights Penghang Yin Shuai Zhang J. Lyu Stanley Osher Y. Qi Jack Xin MQ 95 79 0 19 Jan 2018
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference Benoit Jacob S. Kligys Bo Chen Menglong Zhu Matthew Tang Andrew G. Howard Hartwig Adam Dmitry Kalenichenko MQ 199 3,154 0 15 Dec 2017
StrassenNets: Deep Learning with a Multiplication Budget Michael Tschannen Aran Khanna Anima Anandkumar 63 30 0 11 Dec 2017
Bit Fusion: Bit-Level Dynamically Composable Architecture for Accelerating Deep Neural Networks Hardik Sharma Jongse Park Naveen Suda Liangzhen Lai Benson Chau Joo-Young Kim Vikas Chandra H. Esmaeilzadeh MQ 70 494 0 05 Dec 2017
Deep Learning for Real-Time Crime Forecasting and its Ternarization Bao Wang Penghang Yin Andrea L. Bertozzi P. Brantingham Stanley J. Osher Jack Xin AI4TS 55 85 0 23 Nov 2017
Deep Expander Networks: Efficient Deep Networks from Graph Theory Ameya Prabhu G. Varma A. Namboodiri GNN 131 72 0 23 Nov 2017
Apprentice: Using Knowledge Distillation Techniques To Improve Low-Precision Network Accuracy Asit K. Mishra Debbie Marr FedML 74 331 0 15 Nov 2017
SparCE: Sparsity aware General Purpose Core Extensions to Accelerate Deep Neural Networks Sanchari Sen Shubham Jain Swagath Venkataramani A. Raghunathan 55 30 0 07 Nov 2017
Efficient Inferencing of Compressed Deep Neural Networks Dharma Teja Vooturi Saurabh Goyal Anamitra R. Choudhury Yogish Sabharwal Ashish Verma 42 6 0 01 Nov 2017
Minimum Energy Quantized Neural Networks Bert Moons Koen Goetschalckx Nick Van Berckelaer Marian Verhelst MQ 84 124 0 01 Nov 2017
Towards Effective Low-bitwidth Convolutional Neural Networks Bohan Zhuang Chunhua Shen Mingkui Tan Lingqiao Liu Ian Reid MQ 100 234 0 01 Nov 2017
Deep Learning as a Mixed Convex-Combinatorial Optimization Problem A. Friesen Pedro M. Domingos 46 20 0 31 Oct 2017
A Survey of Model Compression and Acceleration for Deep Neural Networks Yu Cheng Duo Wang Pan Zhou Zhang Tao 166 1,101 0 23 Oct 2017
Deep Neural Network Approximation using Tensor Sketching S. Kasiviswanathan Nina Narodytska Hongxia Jin 34 9 0 21 Oct 2017
Learning Discrete Weights Using the Local Reparameterization Trick Oran Shayer Dan Levi Ethan Fetaya 83 88 0 21 Oct 2017
TensorQuant - A Simulation Toolbox for Deep Neural Network Quantization D. Loroch Norbert Wehn Franz-Josef Pfreundt J. Keuper MQ 57 23 0 13 Oct 2017
To prune, or not to prune: exploring the efficacy of pruning for model compression Michael Zhu Suyog Gupta 204 1,286 0 05 Oct 2017
WRPN: Wide Reduced-Precision Networks Asit K. Mishra Eriko Nurvitadhi Jeffrey J. Cook Debbie Marr MQ 100 267 0 04 Sep 2017
BitNet: Bit-Regularized Deep Neural Networks Aswin Raghavan Mohamed R. Amer S. Chai Graham Taylor MQ 68 10 0 16 Aug 2017
Learning Accurate Low-Bit Deep Neural Networks with Stochastic Quantization Yinpeng Dong Renkun Ni Jianguo Li Yurong Chen Jun Zhu Hang Su MQ 91 62 0 03 Aug 2017
Streaming Architecture for Large-Scale Quantized Neural Networks on an FPGA-Based Dataflow Platform Chaim Baskin Natan Liss Evgenii Zheltonozhskii A. Bronstein A. Mendelson GNN MQ 112 35 0 31 Jul 2017
Extremely Low Bit Neural Network: Squeeze the Last Bit Out with ADMM Cong Leng Hao Li Shenghuo Zhu Rong Jin MQ 70 288 0 24 Jul 2017
Ternary Residual Networks Abhisek Kundu K. Banerjee Naveen Mellempudi Dheevatsa Mudigere Dipankar Das Bharat Kaul Pradeep Dubey 76 8 0 15 Jul 2017
Model compression as constrained optimization, with application to neural nets. Part II: quantization M. A. Carreira-Perpiñán Yerlan Idelbayev MQ 72 37 0 13 Jul 2017
Model compression as constrained optimization, with application to neural nets. Part I: general framework Miguel Á. Carreira-Perpiñán MQ 46 32 0 05 Jul 2017
Structured Sparse Ternary Weight Coding of Deep Neural Networks for Efficient Hardware Implementations Yoonho Boo Wonyong Sung MQ 65 11 0 01 Jul 2017
Hardware-efficient on-line learning through pipelined truncated-error backpropagation in binary-state networks H. Elsayed Bruno U. Pedroni Sadique Sheik Gert Cauwenberghs 54 8 0 15 Jun 2017
YellowFin and the Art of Momentum Tuning Jian Zhang Ioannis Mitliagkas ODL 94 108 0 12 Jun 2017
Training Quantized Nets: A Deeper Understanding Hao Li Soham De Zheng Xu Christoph Studer H. Samet Tom Goldstein MQ 87 211 0 07 Jun 2017
GXNOR-Net: Training deep neural networks with ternary weights and activations without full-precision memory under a unified discretization framework Lei Deng Peng Jiao Jing Pei Zhenzhi Wu Guoqi Li MQ 97 20 0 25 May 2017