Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2505.21987
Cited By
ACE: Exploring Activation Cosine Similarity and Variance for Accurate and Calibration-Efficient LLM Pruning
28 May 2025
Zhendong Mi
Zhenglun Kong
Geng Yuan
Shaoyi Huang
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"ACE: Exploring Activation Cosine Similarity and Variance for Accurate and Calibration-Efficient LLM Pruning"
41 / 41 papers shown
Title
Wanda++: Pruning Large Language Models via Regional Gradients
Yifan Yang
Kai Zhen
Bhavana Ganesh
Aram Galstyan
Goeric Huybrechts
...
S. Bodapati
Nathan Susanj
Zheng Zhang
Jack FitzGerald
Abhishek Kumar
167
3
0
06 Mar 2025
Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for Large Language Models
Peijie Dong
Lujun Li
Zhenheng Tang
Xiang Liu
Xinglin Pan
Qiang-qiang Wang
Xiaowen Chu
132
33
0
05 Jun 2024
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MH
ALM
413
12,076
0
18 Jul 2023
A Simple and Effective Pruning Approach for Large Language Models
Mingjie Sun
Zhuang Liu
Anna Bair
J. Zico Kolter
156
440
0
20 Jun 2023
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Ji Lin
Jiaming Tang
Haotian Tang
Shang Yang
Wei-Ming Chen
Wei-Chen Wang
Guangxuan Xiao
Xingyu Dang
Chuang Gan
Song Han
EDL
MQ
106
578
0
01 Jun 2023
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
...
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
ALM
PILM
1.5K
13,472
0
27 Feb 2023
SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot
Elias Frantar
Dan Alistarh
VLM
113
734
0
02 Jan 2023
SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
Guangxuan Xiao
Ji Lin
Mickael Seznec
Hao Wu
Julien Demouth
Song Han
MQ
207
839
0
18 Nov 2022
Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning
Elias Frantar
Sidak Pal Singh
Dan Alistarh
MQ
109
243
0
24 Aug 2022
Language model compression with weighted low-rank factorization
Yen-Chang Hsu
Ting Hua
Sung-En Chang
Qiang Lou
Yilin Shen
Hongxia Jin
66
108
0
30 Jun 2022
OPT: Open Pre-trained Transformer Language Models
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
...
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLM
OSLM
AI4CE
364
3,699
0
02 May 2022
SPDY: Accurate Pruning with Speedup Guarantees
Elias Frantar
Dan Alistarh
64
35
0
31 Jan 2022
Evaluating Large Language Models Trained on Code
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
...
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
ELM
ALM
238
5,665
0
07 Jul 2021
SimCSE: Simple Contrastive Learning of Sentence Embeddings
Tianyu Gao
Xingcheng Yao
Danqi Chen
AILaw
SSL
280
3,415
0
18 Apr 2021
Accelerating Sparse Deep Neural Networks
Asit K. Mishra
J. Latorre
Jeff Pool
Darko Stosic
Dusan Stosic
Ganesh Venkatesh
Chong Yu
Paulius Micikevicius
164
236
0
16 Apr 2021
BinaryBERT: Pushing the Limit of BERT Quantization
Haoli Bai
Wei Zhang
Lu Hou
Lifeng Shang
Jing Jin
Xin Jiang
Qun Liu
Michael Lyu
Irwin King
MQ
217
227
0
31 Dec 2020
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
904
42,463
0
28 May 2020
Recipes for building an open-domain chatbot
Stephen Roller
Emily Dinan
Naman Goyal
Da Ju
Mary Williamson
...
Myle Ott
Kurt Shuster
Eric Michael Smith
Y-Lan Boureau
Jason Weston
ALM
125
1,014
0
28 Apr 2020
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
500
20,342
0
23 Oct 2019
How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings
Kawin Ethayarajh
93
875
0
02 Sep 2019
Importance Estimation for Neural Network Pruning
Pavlo Molchanov
Arun Mallya
Stephen Tyree
I. Frosio
Jan Kautz
3DPC
92
885
0
25 Jun 2019
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
Christopher Clark
Kenton Lee
Ming-Wei Chang
Tom Kwiatkowski
Michael Collins
Kristina Toutanova
247
1,560
0
24 May 2019
HellaSwag: Can a Machine Really Finish Your Sentence?
Rowan Zellers
Ari Holtzman
Yonatan Bisk
Ali Farhadi
Yejin Choi
185
2,532
0
19 May 2019
HAWQ: Hessian AWare Quantization of Neural Networks with Mixed-Precision
Zhen Dong
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
92
528
0
29 Apr 2019
The State of Sparsity in Deep Neural Networks
Trevor Gale
Erich Elsen
Sara Hooker
165
763
0
25 Feb 2019
Rethinking the Value of Network Pruning
Zhuang Liu
Mingjie Sun
Tinghui Zhou
Gao Huang
Trevor Darrell
40
1,475
0
11 Oct 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM
SSL
SSeg
1.8K
95,229
0
11 Oct 2018
SNIP: Single-shot Network Pruning based on Connection Sensitivity
Namhoon Lee
Thalaiyasingam Ajanthan
Philip Torr
VLM
271
1,211
0
04 Oct 2018
Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering
Todor Mihaylov
Peter Clark
Tushar Khot
Ashish Sabharwal
122
1,570
0
08 Sep 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
1.1K
7,201
0
20 Apr 2018
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge
Peter Clark
Isaac Cowhey
Oren Etzioni
Tushar Khot
Ashish Sabharwal
Carissa Schoenick
Oyvind Tafjord
ELM
RALM
LRM
198
2,670
0
14 Mar 2018
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
Jonathan Frankle
Michael Carbin
274
3,488
0
09 Mar 2018
Learning Efficient Convolutional Networks through Network Slimming
Zhuang Liu
Jianguo Li
Zhiqiang Shen
Gao Huang
Shoumeng Yan
Changshui Zhang
133
2,426
0
22 Aug 2017
Scalable Training of Artificial Neural Networks with Adaptive Sparse Connectivity inspired by Network Science
Decebal Constantin Mocanu
Elena Mocanu
Peter Stone
Phuong H. Nguyen
M. Gibescu
A. Liotta
184
637
0
15 Jul 2017
Pointer Sentinel Mixture Models
Stephen Merity
Caiming Xiong
James Bradbury
R. Socher
RALM
343
2,900
0
26 Sep 2016
Pruning Filters for Efficient ConvNets
Hao Li
Asim Kadav
Igor Durdanovic
H. Samet
H. Graf
3DPC
195
3,707
0
31 Aug 2016
SQuAD: 100,000+ Questions for Machine Comprehension of Text
Pranav Rajpurkar
Jian Zhang
Konstantin Lopyrev
Percy Liang
RALM
316
8,177
0
16 Jun 2016
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
Song Han
Huizi Mao
W. Dally
3DGS
263
8,862
0
01 Oct 2015
Learning both Weights and Connections for Efficient Neural Networks
Song Han
Jeff Pool
J. Tran
W. Dally
CVBM
316
6,709
0
08 Jun 2015
Distributed Representations of Words and Phrases and their Compositionality
Tomas Mikolov
Ilya Sutskever
Kai Chen
G. Corrado
J. Dean
NAI
OCL
404
33,573
0
16 Oct 2013
Efficient Estimation of Word Representations in Vector Space
Tomas Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
693
31,553
0
16 Jan 2013
1