Communities
Connect sessions
AI calendar
Organizations
Contact Sales
Search
Open menu
Home
Papers
2301.00774
Cited By
v1
v2
v3 (latest)
SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot
2 January 2023
Elias Frantar
Dan Alistarh
VLM
Re-assign community
ArXiv (abs)
PDF
HTML
HuggingFace (3 upvotes)
Github (799★)
Papers citing
"SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot"
37 / 287 papers shown
Title
NASH: A Simple Unified Framework of Structured Pruning for Accelerating Encoder-Decoder Language Models
Jongwoo Ko
Seungjoon Park
Yujin Kim
Sumyeong Ahn
Du-Seong Chang
Euijai Ahn
SeYoung Yun
145
9
0
16 Oct 2023
Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs
Yuxin Zhang
Lirui Zhao
Mingbao Lin
Yunyun Sun
Yiwu Yao
Xingjia Han
Jared Tanner
Shiwei Liu
Rongrong Ji
SyDa
177
56
0
13 Oct 2023
Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity
Lu Yin
You Wu
Zhenyu Zhang
Cheng-Yu Hsieh
Yaqing Wang
...
Mykola Pechenizkiy
Yi Liang
Michael Bendersky
Zhangyang Wang
Shiwei Liu
241
119
0
08 Oct 2023
ECoFLaP: Efficient Coarse-to-Fine Layer-Wise Pruning for Vision-Language Models
Yi-Lin Sung
Jaehong Yoon
Mohit Bansal
VLM
119
15
0
04 Oct 2023
Sweeping Heterogeneity with Smart MoPs: Mixture of Prompts for LLM Task Adaptation
Chen Dun
Mirian Hipolito Garcia
Guoqing Zheng
Ahmed Hassan Awadallah
Anastasios Kyrillidis
Robert Sim
270
7
0
04 Oct 2023
Compressing LLMs: The Truth is Rarely Pure and Never Simple
Ajay Jaiswal
Zhe Gan
Xianzhi Du
Bowen Zhang
Zhangyang Wang
Yinfei Yang
MQ
170
56
0
02 Oct 2023
Do Compressed LLMs Forget Knowledge? An Experimental Study with Practical Implications
Duc Hoang
Minsik Cho
Thomas Merth
Mohammad Rastegari
Zhangyang Wang
KELM
CLL
147
5
0
02 Oct 2023
Junk DNA Hypothesis: Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs "Difficult" Downstream Tasks in LLMs
Lu Yin
Ajay Jaiswal
Shiwei Liu
Souvik Kundu
Zhangyang Wang
169
12
0
29 Sep 2023
QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
Yuhui Xu
Lingxi Xie
Xiaotao Gu
Xin Chen
Heng Chang
Hengheng Zhang
Zhensu Chen
Xiaopeng Zhang
Qi Tian
MQ
110
126
0
26 Sep 2023
LORD: Low Rank Decomposition Of Monolingual Code LLMs For One-Shot Compression
Ayush Kaushal
Tejas Vaidhya
Irina Rish
189
22
0
25 Sep 2023
Pruning Large Language Models via Accuracy Predictor
Yupeng Ji
Yibo Cao
Jiu-si Liu
KELM
117
4
0
18 Sep 2023
Norm Tweaking: High-performance Low-bit Quantization of Large Language Models
Liang Li
Qingyuan Li
Bo Zhang
Xiangxiang Chu
MQ
172
36
0
06 Sep 2023
FPTQ: Fine-grained Post-Training Quantization for Large Language Models
Qingyuan Li
Yifan Zhang
Liang Li
Peng Yao
Bo Zhang
Xiangxiang Chu
Yerui Sun
Li-Qiang Du
Yuchen Xie
MQ
150
17
0
30 Aug 2023
Ternary Singular Value Decomposition as a Better Parameterized Form in Linear Mapping
Boyu Chen
Hanxuan Chen
Jiao He
Fengyu Sun
Shangling Jui
136
3
0
15 Aug 2023
Accurate Retraining-free Pruning for Pretrained Encoder-based Language Models
Seungcheol Park
Ho-Jin Choi
U. Kang
VLM
138
9
0
07 Aug 2023
Pruning vs Quantization: Which is Better?
Andrey Kuzmin
Markus Nagel
M. V. Baalen
Arash Behboodi
Tijmen Blankevoort
MQ
185
77
0
06 Jul 2023
Query Understanding in the Age of Large Language Models
Avishek Anand
Venktesh V
Abhijit Anand
Vinay Setty
LRM
135
7
0
28 Jun 2023
H
2
_2
2
O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models
Zhenyu Zhang
Ying Sheng
Wanrong Zhu
Tianlong Chen
Lianmin Zheng
...
Yuandong Tian
Christopher Ré
Clark W. Barrett
Zhangyang Wang
Beidi Chen
VLM
394
387
0
24 Jun 2023
A Simple and Effective Pruning Approach for Large Language Models
Mingjie Sun
Zhuang Liu
Anna Bair
J. Zico Kolter
312
543
0
20 Jun 2023
The Emergence of Essential Sparsity in Large Pre-trained Models: The Weights that Matter
Ajay Jaiswal
Shiwei Liu
Tianlong Chen
Zhangyang Wang
VLM
144
39
0
06 Jun 2023
Intriguing Properties of Quantization at Scale
Arash Ahmadian
Saurabh Dash
Hongyu Chen
Bharat Venkitesh
Stephen Gou
Phil Blunsom
Ahmet Üstün
Sara Hooker
MQ
157
40
0
30 May 2023
Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers
Sotiris Anagnostidis
Dario Pavllo
Luca Biggio
Lorenzo Noci
Aurelien Lucchi
Thomas Hofmann
198
62
0
25 May 2023
Winner-Take-All Column Row Sampling for Memory Efficient Adaptation of Language Model
Zirui Liu
Guanchu Wang
Shaochen Zhong
Zhaozhuo Xu
Daochen Zha
...
Zhimeng Jiang
Kaixiong Zhou
Vipin Chaudhary
Shuai Xu
Helen Zhou
118
19
0
24 May 2023
LLM-Pruner: On the Structural Pruning of Large Language Models
Xinyin Ma
Gongfan Fang
Xinchao Wang
312
550
0
19 May 2023
Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt
Zhaozhuo Xu
Zirui Liu
Beidi Chen
Yuxin Tang
Jue Wang
Kaixiong Zhou
Helen Zhou
Anshumali Shrivastava
MQ
162
36
0
17 May 2023
CrAFT: Compression-Aware Fine-Tuning for Efficient Visual Task Adaptation
J. Heo
S. Azizi
A. Fayyazi
Massoud Pedram
109
1
0
08 May 2023
Towards Automated Circuit Discovery for Mechanistic Interpretability
Arthur Conmy
Augustine N. Mavor-Parker
Aengus Lynch
Stefan Heimersheim
Adrià Garriga-Alonso
194
375
0
28 Apr 2023
Sparsified Model Zoo Twins: Investigating Populations of Sparsified Neural Network Models
D. Honegger
Konstantin Schurholt
Damian Borth
132
5
0
26 Apr 2023
Towards Compute-Optimal Transfer Learning
Massimo Caccia
Alexandre Galashov
Arthur Douillard
Amal Rannen-Triki
Dushyant Rao
Michela Paganini
Laurent Charlin
MarcÁurelio Ranzato
Razvan Pascanu
109
3
0
25 Apr 2023
Blockwise Compression of Transformer-based Models without Retraining
Gaochen Dong
W. Chen
56
3
0
04 Apr 2023
RPTQ: Reorder-based Post-training Quantization for Large Language Models
Zhihang Yuan
Lin Niu
Jia-Wen Liu
Wenyu Liu
Xinggang Wang
Yuzhang Shang
Guangyu Sun
Qiang Wu
Jiaxiang Wu
Bingzhe Wu
MQ
283
99
0
03 Apr 2023
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU
Ying Sheng
Lianmin Zheng
Binhang Yuan
Zhuohan Li
Max Ryabinin
...
Joseph E. Gonzalez
Percy Liang
Christopher Ré
Ion Stoica
Ce Zhang
295
482
0
13 Mar 2023
Streaming Kernel PCA Algorithm With Small Space
Yichuan Deng
Zhao Song
Zifan Wang
Hangke Zhang
140
4
0
08 Mar 2023
Complex QA and language models hybrid architectures, Survey
Xavier Daull
P. Bellot
Emmanuel Bruno
Vincent Martin
Elisabeth Murisasco
ELM
374
17
0
17 Feb 2023
A Comprehensive Review and a Taxonomy of Edge Machine Learning: Requirements, Paradigms, and Techniques
Wenbin Li
Hakim Hacid
Ebtesam Almazrouei
Merouane Debbah
166
16
0
16 Feb 2023
Broken Neural Scaling Laws
Ethan Caballero
Kshitij Gupta
Irina Rish
David M. Krueger
466
90
0
26 Oct 2022
An Information Theory-inspired Strategy for Automatic Network Pruning
Xiawu Zheng
Yuexiao Ma
Teng Xi
Gang Zhang
Errui Ding
Yuchao Li
Jie Chen
Yonghong Tian
Rongrong Ji
361
15
0
19 Aug 2021
Previous
1
2
3
4
5
6