Can pruning make Large Language Models more efficient?

6 October 2023

Papers citing "Can pruning make Large Language Models more efficient?"


Title
Can a student Large Language Model perform as well as it's teacher? | Sia Gholami, Marwan Omar | - | 46 | 12 | 0 | 03 Oct 2023
Reducing Transformer Depth on Demand with Structured Dropout | Angela Fan, Edouard Grave, Armand Joulin | - | 120 | 596 | 0 | 25 Sep 2019
Are Sixteen Heads Really Better than One? | Paul Michel, Omer Levy, Graham Neubig | MoE | 107 | 1,068 | 0 | 25 May 2019
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned | Elena Voita, David Talbot, F. Moiseev, Rico Sennrich, Ivan Titov | - | 117 | 1,148 | 0 | 23 May 2019
The State of Sparsity in Deep Neural Networks | Trevor Gale, Erich Elsen, Sara Hooker | - | 163 | 763 | 0 | 25 Feb 2019
To prune, or not to prune: exploring the efficacy of pruning for model compression | Michael Zhu, Suyog Gupta | - | 197 | 1,281 | 0 | 05 Oct 2017
Rocket Launching: A Universal and Efficient Framework for Training Well-performing Light Net | Guorui Zhou, Ying Fan, Runpeng Cui, Weijie Bian, Xiaoqiang Zhu, Kun Gai | - | 71 | 116 | 0 | 14 Aug 2017
Pruning Filters for Efficient ConvNets | Hao Li, Asim Kadav, Igor Durdanovic, H. Samet, H. Graf | 3DPC | 195 | 3,705 | 0 | 31 Aug 2016
EIE: Efficient Inference Engine on Compressed Deep Neural Network | Song Han, Xingyu Liu, Huizi Mao, Jing Pu, A. Pedram, M. Horowitz, W. Dally | - | 129 | 2,461 | 0 | 04 Feb 2016
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding | Song Han, Huizi Mao, W. Dally | 3DGS | 263 | 8,862 | 0 | 01 Oct 2015