
SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot

2 January 2023
Elias Frantar
Dan Alistarh
    VLM
ArXiv (abs) · PDF · HTML · HuggingFace (3 upvotes) · GitHub (799★)

Papers citing "SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot"

50 / 287 papers shown
Missed Causes and Ambiguous Effects: Counterfactuals Pose Challenges for Interpreting Neural Networks
Aaron Mueller
CML
124
15
0
05 Jul 2024
SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking
Xingrun Xing
Boyan Gao
Zheng Zhang
David A. Clifton
Shitao Xiao
Li Du
Guoqi Li
Jiajun Zhang
229
13
0
05 Jul 2024
Let the Code LLM Edit Itself When You Edit the Code
Zhenyu He
Jun Zhang
Shengjie Luo
Jingjing Xu
Zongzhang Zhang
Di He
KELM
157
1
0
03 Jul 2024
Learning Neural Networks with Sparse Activations
Pranjal Awasthi
Nishanth Dikkala
Pritish Kamath
Raghu Meka
179
5
0
26 Jun 2024
Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging
Deyuan Liu
Zhan Qin
Han Wang
Zhao Yang
Zecheng Wang
...
Zhao Lv
Zhiying Tu
Dianhui Chu
Bo Li
Dianbo Sui
188
4
0
24 Jun 2024
BlockPruner: Fine-grained Pruning for Large Language Models
Longguang Zhong
Fanqi Wan
Ruijun Chen
Xiaojun Quan
Liangzhi Li
197
15
0
15 Jun 2024
AdaPTwin: Low-Cost Adaptive Compression of Product Twins in Transformers
Emil Biju
Anirudh Sriram
Mert Pilanci
123
0
0
13 Jun 2024
ALPS: Improved Optimization for Highly Sparse One-Shot Pruning for Large Language Models
Xiang Meng
Kayhan Behdin
Haoyue Wang
Rahul Mazumder
112
10
0
12 Jun 2024
DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion
Yilong Chen
Linhao Zhang
Junyuan Shang
Ying Tai
Tingwen Liu
Shuohuan Wang
Yu Sun
91
3
0
03 Jun 2024
Effective Interplay between Sparsity and Quantization: From Theory to Practice
Simla Burcu Harma
Ayan Chakraborty
Elizaveta Kostenok
Danila Mishin
Dongho Ha
...
Martin Jaggi
Ming Liu
Yunho Oh
Suvinay Subramanian
Amir Yazdanbakhsh
MQ
177
16
0
31 May 2024
Occam Gradient Descent
B. N. Kausik
ODLVLM
153
0
0
30 May 2024
NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models
Chankyu Lee
Rajarshi Roy
Mengyao Xu
Jonathan Raiman
Mohammad Shoeybi
Bryan Catanzaro
Ming-Yu Liu
RALM
419
298
0
27 May 2024
Subspace Node Pruning
Joshua Offergeld
Marcel van Gerven
Nasir Ahmad
112
0
0
26 May 2024
Large Language Model Pruning
Hanjuan Huang
Hao-Jia Song
H. Pao
202
0
0
24 May 2024
SliM-LLM: Salience-Driven Mixed-Precision Quantization for Large Language Models
Wei Huang
Haotong Qin
Yangdong Liu
Yawei Li
Qinshuo Liu
Xianglong Liu
Luca Benini
Michele Magno
Shiming Zhang
Xiaojuan Qi
MQ
217
26
0
23 May 2024
Pruning as a Domain-specific LLM Extractor
Nan Zhang
Yanchi Liu
Xujiang Zhao
Wei Cheng
Runxue Bao
Rui Zhang
Prasenjit Mitra
Haifeng Chen
85
17
0
10 May 2024
Mélange: Cost Efficient Large Language Model Serving by Exploiting GPU Heterogeneity
Tyler Griggs
Xiaoxuan Liu
Jiaxiang Yu
Doyoung Kim
Wei-Lin Chiang
Alvin Cheung
Ion Stoica
188
21
0
22 Apr 2024
SparseDM: Toward Sparse Efficient Diffusion Models
Kafeng Wang
Jianfei Chen
He Li
Zhenpeng Mi
Jun-Jie Zhu
DiffM
266
13
0
16 Apr 2024
Language Model Cascades: Token-level uncertainty and beyond
Neha Gupta
Harikrishna Narasimhan
Wittawat Jitkrittum
A. S. Rawat
A. Menon
Sanjiv Kumar
UQLM
187
73
0
15 Apr 2024
CQIL: Inference Latency Optimization with Concurrent Computation of Quasi-Independent Layers
Longwei Zou
Qingyang Wang
Han Zhao
Tingfeng Liu
Yi Yang
Yangdong Deng
133
0
0
10 Apr 2024
Multilingual Brain Surgeon: Large Language Models Can be Compressed Leaving No Language Behind
Hongchuan Zeng
Hongshen Xu
Lu Chen
Kai Yu
148
7
0
06 Apr 2024
LC-LLM: Explainable Lane-Change Intention and Trajectory Predictions with Large Language Models
Mingxing Peng
Xusen Guo
Xianda Chen
Meixin Zhu
Kehua Chen
Hao Yang
Xuesong Wang
Yinhai Wang
LRM
150
32
0
27 Mar 2024
AI and Memory Wall
A. Gholami
Z. Yao
Sehoon Kim
Coleman Hooper
Michael W. Mahoney
Kurt Keutzer
128
206
0
21 Mar 2024
Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression
Junyuan Hong
Jinhao Duan
Chenhui Zhang
Zhangheng Li
Chulin Xie
...
B. Kailkhura
Dan Hendrycks
Dawn Song
Zhangyang Wang
Yue Liu
171
36
0
18 Mar 2024
SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression
Xin Wang
Yu Zheng
Zhongwei Wan
Mi Zhang
MQ
277
102
0
12 Mar 2024
IntactKV: Improving Large Language Model Quantization by Keeping Pivot Tokens Intact
Ruikang Liu
Haoli Bai
Haokun Lin
Yuening Li
Han Gao
Zheng-Jun Xu
Lu Hou
Jun Yao
Chun Yuan
MQ
143
39
0
02 Mar 2024
NeuroPrune: A Neuro-inspired Topological Sparse Training Algorithm for Large Language Models
Amit Dhurandhar
Tejaswini Pedapati
Ronny Luss
Soham Dan
Aurélie C. Lozano
Payel Das
Georgios Kollias
181
3
0
28 Feb 2024
SequentialAttention++ for Block Sparsification: Differentiable Pruning Meets Combinatorial Optimization
T. Yasuda
Kyriakos Axiotis
Gang Fu
M. Bateni
Vahab Mirrokni
304
0
0
27 Feb 2024
Data-free Weight Compress and Denoise for Large Language Models
Runyu Peng
Yunhua Zhou
Qipeng Guo
Yang Gao
Hang Yan
Xipeng Qiu
Dahua Lin
203
3
0
26 Feb 2024
ProSparse: Introducing and Enhancing Intrinsic Activation Sparsity within Large Language Models
Chenyang Song
Xu Han
Zhengyan Zhang
Shengding Hu
Xiyu Shi
...
Chen Chen
Zhiyuan Liu
Guanglin Li
Tao Yang
Maosong Sun
221
38
0
21 Feb 2024
Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding
Zhuoming Chen
Avner May
Ruslan Svirschevski
Yuhsun Huang
Max Ryabinin
Zhihao Jia
Beidi Chen
168
62
0
19 Feb 2024
BESA: Pruning Large Language Models with Blockwise Parameter-Efficient Sparsity Allocation
Peng Xu
Wenqi Shao
Mengzhao Chen
Shitao Tang
Kai-Chuang Zhang
Shiyang Feng
Fengwei An
Yu Qiao
Ping Luo
MoE
187
38
0
18 Feb 2024
Why Lift so Heavy? Slimming Large Language Models by Cutting Off the Layers
Shuzhou Yuan
Ercong Nie
Bolei Ma
Michael Farber
161
5
0
18 Feb 2024
Any-Precision LLM: Low-Cost Deployment of Multiple, Different-Sized LLMs
Yeonhong Park
Jake Hyun
SangLyul Cho
Bonggeun Sim
Jae W. Lee
MQ
197
25
0
16 Feb 2024
Squat: Quant Small Language Models on the Edge
Xuan Shen
Zhenglun Kong
Zhaoyang Han
Changdi Yang
...
Lei Lu
Cheng Lyu
Zhihao Shu
Wei Niu
Miriam Leeser
MQ
194
19
0
16 Feb 2024
Towards Meta-Pruning via Optimal Transport
Alexander Theus
Olin Geimer
Friedrich Wicke
Thomas Hofmann
Sotiris Anagnostidis
Sidak Pal Singh
MoMe
143
5
0
12 Feb 2024
RepQuant: Towards Accurate Post-Training Quantization of Large Transformer Models via Scale Reparameterization
Zhikai Li
Xuewen Liu
Jing Zhang
Qingyi Gu
MQ
149
8
0
08 Feb 2024
Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes
Lucio Dery
Steven Kolawole
Jean-Francois Kagey
Virginia Smith
Graham Neubig
Ameet Talwalkar
148
42
0
08 Feb 2024
The Fine-Grained Complexity of Gradient Computation for Training Large Language Models
Josh Alman
Zhao Song
105
21
0
07 Feb 2024
APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference
Bowen Zhao
Hannaneh Hajishirzi
Qingqing Cao
203
24
0
22 Jan 2024
The LLM Surgeon
Tycho F. A. van der Ouderaa
Markus Nagel
M. V. Baalen
Yuki Markus Asano
Tijmen Blankevoort
154
22
0
28 Dec 2023
Towards Efficient Generative Large Language Model Serving: A Survey from Algorithms to Systems
Xupeng Miao
Gabriele Oliaro
Zhihao Zhang
Xinhao Cheng
Hongyi Jin
Tianqi Chen
Zhihao Jia
190
96
0
23 Dec 2023
PERP: Rethinking the Prune-Retrain Paradigm in the Era of LLMs
Max Zimmer
Megi Andoni
Christoph Spiegel
Sebastian Pokutta
VLM
317
10
0
23 Dec 2023
Fluctuation-based Adaptive Structured Pruning for Large Language Models
Yongqi An
Xu Zhao
Tao Yu
Ming Tang
Jinqiao Wang
143
78
0
19 Dec 2023
Large Multimodal Model Compression via Efficient Pruning and Distillation at AntGroup
Maolin Wang
Yao-Min Zhao
Jiajia Liu
Jingdong Chen
Chenyi Zhuang
Jinjie Gu
Ruocheng Guo
Xiangyu Zhao
89
7
0
10 Dec 2023
ASVD: Activation-aware Singular Value Decomposition for Compressing Large Language Models
Zhihang Yuan
Yuzhang Shang
Yue Song
Dawei Yang
Qiang Wu
Yan Yan
Guangyu Sun
MQ
242
81
0
10 Dec 2023
A Speed Odyssey for Deployable Quantization of LLMs
Qingyuan Li
Ran Meng
Yiduo Li
Bo Zhang
Liang Li
Yifan Lu
Xiangxiang Chu
Yerui Sun
Yuchen Xie
MQ
126
9
0
16 Nov 2023
Towards the Law of Capacity Gap in Distilling Language Models
Chen Zhang
Qiuchi Li
Dawei Song
Zheyu Ye
Yan Gao
Yan Hu
ELM
175
27
0
13 Nov 2023
CRaSh: Clustering, Removing, and Sharing Enhance Fine-tuning without Full Large Language Model
Kaiyan Zhang
Ning Ding
Biqing Qi
Xuekai Zhu
Xinwei Long
Bowen Zhou
139
5
0
24 Oct 2023
Towards Robust Pruning: An Adaptive Knowledge-Retention Pruning Strategy for Language Models
Jianwei Li
Qi Lei
Wei Cheng
Dongkuan Xu
KELM
127
6
0
19 Oct 2023