Structured Pruning of Large Language Models



10 October 2019

Jeremy Wohlwend


Papers citing "Structured Pruning of Large Language Models"

10 / 60 papers shown

Title | Authors | Tags | Metrics | Date
Efficient softmax approximation for GPUs | Edouard Grave, Armand Joulin, Moustapha Cissé, David Grangier, Hervé Jégou | - | 97 272 0 | 14 Sep 2016
Compression of Neural Machine Translation Models via Pruning | A. See, Minh-Thang Luong, Christopher D. Manning | MedIm, VLM | 52 220 0 | 29 Jun 2016
On Multiplicative Integration with Recurrent Neural Networks | Yuhuai Wu, Saizheng Zhang, Yanzhe Zhang, Yoshua Bengio, Ruslan Salakhutdinov | - | 67 156 0 | 21 Jun 2016
EIE: Efficient Inference Engine on Compressed Deep Neural Network | Song Han, Xingyu Liu, Huizi Mao, Jing Pu, A. Pedram, M. Horowitz, W. Dally | - | 127 2,461 0 | 04 Feb 2016
Semi-supervised Sequence Learning | Andrew M. Dai, Quoc V. Le | SSL | 132 1,234 0 | 04 Nov 2015
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding | Song Han, Huizi Mao, W. Dally | 3DGS | 263 8,859 0 | 01 Oct 2015
Learning both Weights and Connections for Efficient Neural Networks | Song Han, Jeff Pool, J. Tran, W. Dally | CVBM | 313 6,700 0 | 08 Jun 2015
Distilling the Knowledge in a Neural Network | Geoffrey E. Hinton, Oriol Vinyals, J. Dean | FedML | 364 19,733 0 | 09 Mar 2015
Compressing Deep Convolutional Networks using Vector Quantization | Yunchao Gong, Liu Liu, Ming Yang, Lubomir D. Bourdev | MQ | 176 1,171 0 | 18 Dec 2014
Do Deep Nets Really Need to be Deep? | Lei Jimmy Ba, R. Caruana | - | 173 2,119 0 | 21 Dec 2013