Magnitude Pruning of Large Pretrained Transformer Models with a Mixture Gaussian Prior

1 November 2024

Papers citing "Magnitude Pruning of Large Pretrained Transformer Models with a Mixture Gaussian Prior"

20 / 20 papers shown

Title
LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation Yixiao Li Yifan Yu Qingru Zhang Chen Liang Pengcheng He Weizhu Chen Tuo Zhao 120 75 0 20 Jun 2023
Nonlinear Sufficient Dimension Reduction with a Stochastic Neural Network Siqi Liang Y. Sun F. Liang BDL 71 11 0 09 Oct 2022
The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models Eldar Kurtic Daniel Fernando Campos Tuan Nguyen Elias Frantar Mark Kurtz Ben Fineran Michael Goin Dan Alistarh VLM MQ MedIm 107 126 0 14 Mar 2022
Neural Pruning via Growing Regularization Huan Wang Can Qin Yulun Zhang Y. Fu 94 146 0 16 Dec 2020
Language Models are Few-Shot Learners Tom B. Brown Benjamin Mann Nick Ryder Melanie Subbiah Jared Kaplan ... Christopher Berner Sam McCandlish Alec Radford Ilya Sutskever Dario Amodei BDL 904 42,463 0 28 May 2020
Importance Estimation for Neural Network Pruning Pavlo Molchanov Arun Mallya Stephen Tyree I. Frosio Jan Kautz 3DPC 92 885 0 25 Jun 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding Jacob Devlin Ming-Wei Chang Kenton Lee Kristina Toutanova VLM SSL SSeg 1.8K 95,324 0 11 Oct 2018
SNIP: Single-shot Network Pruning based on Connection Sensitivity Namhoon Lee Thalaiyasingam Ajanthan Philip Torr VLM 271 1,211 0 04 Oct 2018
Don't Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization Shashi Narayan Shay B. Cohen Mirella Lapata AILaw 158 1,684 0 27 Aug 2018
Neural Network Acceptability Judgments Alex Warstadt Amanpreet Singh Samuel R. Bowman 255 1,413 0 31 May 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding Alex Jinpeng Wang Amanpreet Singh Julian Michael Felix Hill Omer Levy Samuel R. Bowman ELM 1.1K 7,201 0 20 Apr 2018
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks Jonathan Frankle Michael Carbin 274 3,489 0 09 Mar 2018
Nearly optimal Bayesian Shrinkage for High Dimensional Regression Qifan Song F. Liang 57 79 0 24 Dec 2017
To prune, or not to prune: exploring the efficacy of pruning for model compression Michael Zhu Suyog Gupta 202 1,282 0 05 Oct 2017
SemEval-2017 Task 1: Semantic Textual Similarity - Multilingual and Cross-lingual Focused Evaluation Daniel Cer Mona T. Diab Eneko Agirre I. Lopez-Gazpio Lucia Specia 445 1,891 0 31 Jul 2017
A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference Adina Williams Nikita Nangia Samuel R. Bowman 524 4,497 0 18 Apr 2017
SQuAD: 100,000+ Questions for Machine Comprehension of Text Pranav Rajpurkar Jian Zhang Konstantin Lopyrev Percy Liang RALM 316 8,177 0 16 Jun 2016
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding Song Han Huizi Mao W. Dally 3DGS 263 8,864 0 01 Oct 2015
Teaching Machines to Read and Comprehend Karl Moritz Hermann Tomás Kociský Edward Grefenstette L. Espeholt W. Kay Mustafa Suleyman Phil Blunsom 355 3,555 0 10 Jun 2015
Learning both Weights and Connections for Efficient Neural Networks Song Han Jeff Pool J. Tran W. Dally CVBM 316 6,709 0 08 Jun 2015