Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2503.04992
Cited By
v1
v2
v3
v4 (latest)
Wanda++: Pruning Large Language Models via Regional Gradients
6 March 2025
Yifan Yang
Kai Zhen
Bhavana Ganesh
Aram Galstyan
Goeric Huybrechts
Markus Müller
Jonas M. Kübler
Rupak Vignesh Swaminathan
Athanasios Mouchtaris
S. Bodapati
Nathan Susanj
Zheng Zhang
Jack FitzGerald
Abhishek Kumar
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Wanda++: Pruning Large Language Models via Regional Gradients"
27 / 27 papers shown
Title
ACE: Exploring Activation Cosine Similarity and Variance for Accurate and Calibration-Efficient LLM Pruning
Zhendong Mi
Zhenglun Kong
Geng Yuan
Shaoyi Huang
30
0
0
28 May 2025
Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for Large Language Models
Peijie Dong
Lujun Li
Zhenheng Tang
Xiang Liu
Xinglin Pan
Qiang-qiang Wang
Xiaowen Chu
132
33
0
05 Jun 2024
SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Saleh Ashkboos
Maximilian L. Croci
Marcelo Gennari do Nascimento
Torsten Hoefler
James Hensman
VLM
200
184
0
26 Jan 2024
Parameter-Efficient Sparsity Crafting from Dense to Mixture-of-Experts for Instruction Tuning on General Tasks
Haoyuan Wu
Haisheng Zheng
Zhuolun He
Bei Yu
MoE
ALM
75
16
0
05 Jan 2024
Beyond Size: How Gradients Shape Pruning Decisions in Large Language Models
Rocktim Jyoti Das
Mingjie Sun
Liqun Ma
Zhiqiang Shen
VLM
52
19
0
08 Nov 2023
Efficient N:M Sparse DNN Training Using Algorithm, Architecture, and Dataflow Co-Design
Chao Fang
Wei Sun
Aojun Zhou
Zhongfeng Wang
44
3
0
22 Sep 2023
A Simple and Effective Pruning Approach for Large Language Models
Mingjie Sun
Zhuang Liu
Anna Bair
J. Zico Kolter
156
440
0
20 Jun 2023
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Ji Lin
Jiaming Tang
Haotian Tang
Shang Yang
Wei-Ming Chen
Wei-Chen Wang
Guangxuan Xiao
Xingyu Dang
Chuang Gan
Song Han
EDL
MQ
104
578
0
01 Jun 2023
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
ALM
389
4,163
0
29 May 2023
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
...
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
ALM
PILM
1.5K
13,472
0
27 Feb 2023
HomoDistil: Homotopic Task-Agnostic Distillation of Pre-trained Transformers
Chen Liang
Haoming Jiang
Zheng Li
Xianfeng Tang
Bin Yin
Tuo Zhao
VLM
111
25
0
19 Feb 2023
SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot
Elias Frantar
Dan Alistarh
VLM
113
734
0
02 Jan 2023
BEBERT: Efficient and Robust Binary Ensemble BERT
Jiayi Tian
Chao Fang
Hong Wang
Zhongfeng Wang
MQ
82
17
0
28 Oct 2022
Structural Pruning via Latency-Saliency Knapsack
Maying Shen
Hongxu Yin
Pavlo Molchanov
Lei Mao
Jianna Liu
J. Álvarez
95
50
0
13 Oct 2022
FP8 Quantization: The Power of the Exponent
Andrey Kuzmin
M. V. Baalen
Yuwei Ren
Markus Nagel
Jorn W. T. Peters
Tijmen Blankevoort
MQ
71
86
0
19 Aug 2022
Language model compression with weighted low-rank factorization
Yen-Chang Hsu
Ting Hua
Sung-En Chang
Qiang Lou
Yilin Shen
Hongxia Jin
66
108
0
30 Jun 2022
Using DeepSpeed and Megatron to Train Megatron-Turing NLG 530B, A Large-Scale Generative Language Model
Shaden Smith
M. Patwary
Brandon Norick
P. LeGresley
Samyam Rajbhandari
...
Mohammad Shoeybi
Yuxiong He
Michael Houston
Saurabh Tiwary
Bryan Catanzaro
MoE
163
743
0
28 Jan 2022
LoRA: Low-Rank Adaptation of Large Language Models
J. E. Hu
Yelong Shen
Phillip Wallis
Zeyuan Allen-Zhu
Yuanzhi Li
Shean Wang
Lu Wang
Weizhu Chen
OffRL
AI4TS
AI4CE
ALM
AIMat
502
10,526
0
17 Jun 2021
Revisiting Locally Supervised Learning: an Alternative to End-to-end Training
Yulin Wang
Zanlin Ni
Shiji Song
Le Yang
Gao Huang
65
85
0
26 Jan 2021
The State of Sparsity in Deep Neural Networks
Trevor Gale
Erich Elsen
Sara Hooker
163
763
0
25 Feb 2019
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
95,229
0
11 Oct 2018
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
Jonathan Frankle
Michael Carbin
274
3,488
0
09 Mar 2018
Learning Efficient Convolutional Networks through Network Slimming
Zhuang Liu
Jianguo Li
Zhiqiang Shen
Gao Huang
Shoumeng Yan
Changshui Zhang
133
2,426
0
22 Aug 2017
An overview of gradient descent optimization algorithms
Sebastian Ruder
ODL
211
6,206
0
15 Sep 2016
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
Song Han
Huizi Mao
W. Dally
3DGS
263
8,862
0
01 Oct 2015
Learning both Weights and Connections for Efficient Neural Networks
Song Han
Jeff Pool
J. Tran
W. Dally
CVBM
316
6,709
0
08 Jun 2015
Adam: A Method for Stochastic Optimization
Diederik P. Kingma
Jimmy Ba
ODL
2.1K
150,364
0
22 Dec 2014
1