Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers

29 November 2021
Junhao Xu, Xie Chen, Shoukang Hu, Jianwei Yu, Xunying Liu, Helen Meng
MQ
ArXiv (abs) · PDF · HTML
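
Since the title names the core technique (training low-bit weights with the alternating direction method of multipliers), a minimal sketch may help situate the citing papers listed below. The snippet shows a generic ADMM-style quantization loop on a toy least-squares layer using NumPy; the ternary grid, the toy problem, and all hyperparameters are illustrative assumptions, not the authors' RNN language-model setup.

```python
# Illustrative ADMM-style low-bit (ternary) weight quantization on a toy
# least-squares problem. A sketch of the general technique named in the
# paper title, not the authors' RNN language-model implementation.
import numpy as np

rng = np.random.default_rng(0)

# Toy regression standing in for one network layer: y ≈ X @ w
X = rng.normal(size=(256, 32))
w_true = rng.normal(size=32)
y = X @ w_true


def project_ternary(v, iters=10):
    """Project v onto the grid alpha * {-1, 0, +1} by alternating between
    the integer codes (given alpha) and the scale alpha (given codes)."""
    alpha = np.abs(v).mean() + 1e-12
    codes = np.clip(np.round(v / alpha), -1, 1)
    for _ in range(iters):
        codes = np.clip(np.round(v / alpha), -1, 1)
        denom = codes @ codes
        if denom > 0:
            alpha = (v @ codes) / denom
    return alpha * codes


# ADMM variables: continuous weights w, quantized copy q, scaled dual u,
# enforcing the constraint w = q.
w = np.zeros(32)
q = np.zeros(32)
u = np.zeros(32)
rho, lr = 1.0, 0.1

for _ in range(100):
    # w-update: a few gradient steps on loss(w) + (rho/2) * ||w - q + u||^2
    for _ in range(5):
        grad = X.T @ (X @ w - y) / len(y) + rho * (w - q + u)
        w -= lr * grad
    # q-update: project the augmented weights onto the low-bit grid
    q = project_ternary(w + u)
    # dual update: accumulate the remaining constraint violation
    u += w - q

print("continuous-weight MSE:", np.mean((X @ w - y) ** 2))
print("ternary-weight MSE:   ", np.mean((X @ q - y) ** 2))
```

The alternation between a continuous weight update, a closed-form projection onto a discrete grid, and a dual-variable update is the pattern shared by several of the citing papers below (for example, "Extremely Low Bit Neural Network: Squeeze the Last Bit Out with ADMM" and "Alternating Multi-bit Quantization for Recurrent Neural Networks").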

Papers citing "Low-bit Quantization of Recurrent Neural Network Language Models Using Alternating Direction Methods of Multipliers"

19 / 19 papers shown

Alternating Multi-bit Quantization for Recurrent Neural Networks
Chen Xu, Jianqiang Yao, Zhouchen Lin, Wenwu Ou, Yuanbin Cao, Zhirong Wang, H. Zha
MQ · 01 Feb 2018

Extremely Low Bit Neural Network: Squeeze the Last Bit Out with ADMM
Cong Leng, Hao Li, Shenghuo Zhu, Rong Jin
MQ · 24 Jul 2017

Trained Ternary Quantization
Chenzhuo Zhu, Song Han, Huizi Mao, W. Dally
MQ · 04 Dec 2016

Quantized Neural Networks: Training Neural Networks with Low Precision Weights and Activations
Itay Hubara, Matthieu Courbariaux, Daniel Soudry, Ran El-Yaniv, Yoshua Bengio
MQ · 22 Sep 2016

Learning Structured Sparsity in Deep Neural Networks
W. Wen, Chunpeng Wu, Yandan Wang, Yiran Chen, Hai Helen Li
12 Aug 2016

Sequence-Level Knowledge Distillation
Yoon Kim, Alexander M. Rush
25 Jun 2016

XNOR-Net: ImageNet Classification Using Binary Convolutional Neural Networks
Mohammad Rastegari, Vicente Ordonez, Joseph Redmon, Ali Farhadi
MQ · 16 Mar 2016

Binarized Neural Networks
Itay Hubara, Daniel Soudry, Ran El-Yaniv
MQ · 08 Feb 2016

Convolutional neural networks with low-rank regularization
Cheng Tai, Tong Xiao, Yi Zhang, Xiaogang Wang, E. Weinan
BDL · 19 Nov 2015

BinaryConnect: Training Deep Neural Networks with binary weights during propagations
Matthieu Courbariaux, Yoshua Bengio, J. David
MQ · 02 Nov 2015

Structured Transforms for Small-Footprint Deep Learning
Vikas Sindhwani, Tara N. Sainath, Sanjiv Kumar
06 Oct 2015

Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
Song Han, Huizi Mao, W. Dally
3DGS · 01 Oct 2015

Learning both Weights and Connections for Efficient Neural Networks
Song Han, Jeff Pool, J. Tran, W. Dally
CVBM · 08 Jun 2015

Compressing Neural Networks with the Hashing Trick
Wenlin Chen, James T. Wilson, Stephen Tyree, Kilian Q. Weinberger, Yixin Chen
19 Apr 2015

Training Binary Multilayer Neural Networks for Image Classification using Expectation Backpropagation
Zhiyong Cheng, Daniel Soudry, Zexi Mao, Zhenzhong Lan
MQ · 12 Mar 2015

Distilling the Knowledge in a Neural Network
Geoffrey E. Hinton, Oriol Vinyals, J. Dean
FedML · 09 Mar 2015

Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition
V. Lebedev, Yaroslav Ganin, M. Rakhuba, Ivan Oseledets, Victor Lempitsky
19 Dec 2014

Compressing Deep Convolutional Networks using Vector Quantization
Yunchao Gong, Liu Liu, Ming Yang, Lubomir D. Bourdev
MQ · 18 Dec 2014

Speeding up Convolutional Neural Networks with Low Rank Expansions
Max Jaderberg, Andrea Vedaldi, Andrew Zisserman
15 May 2014