ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2505.21987
  4. Cited By
ACE: Exploring Activation Cosine Similarity and Variance for Accurate and Calibration-Efficient LLM Pruning

ACE: Exploring Activation Cosine Similarity and Variance for Accurate and Calibration-Efficient LLM Pruning

28 May 2025
Zhendong Mi
Zhenglun Kong
Geng Yuan
Shaoyi Huang
ArXiv (abs)PDFHTML

Papers citing "ACE: Exploring Activation Cosine Similarity and Variance for Accurate and Calibration-Efficient LLM Pruning"

41 / 41 papers shown
Title
Wanda++: Pruning Large Language Models via Regional Gradients
Wanda++: Pruning Large Language Models via Regional Gradients
Yifan Yang
Kai Zhen
Bhavana Ganesh
Aram Galstyan
Goeric Huybrechts
...
S. Bodapati
Nathan Susanj
Zheng Zhang
Jack FitzGerald
Abhishek Kumar
167
3
0
06 Mar 2025
Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for Large
  Language Models
Pruner-Zero: Evolving Symbolic Pruning Metric from scratch for Large Language Models
Peijie Dong
Lujun Li
Zhenheng Tang
Xiang Liu
Xinglin Pan
Qiang-qiang Wang
Xiaowen Chu
132
33
0
05 Jun 2024
Llama 2: Open Foundation and Fine-Tuned Chat Models
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MHALM
413
12,076
0
18 Jul 2023
A Simple and Effective Pruning Approach for Large Language Models
A Simple and Effective Pruning Approach for Large Language Models
Mingjie Sun
Zhuang Liu
Anna Bair
J. Zico Kolter
156
440
0
20 Jun 2023
AWQ: Activation-aware Weight Quantization for LLM Compression and
  Acceleration
AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
Ji Lin
Jiaming Tang
Haotian Tang
Shang Yang
Wei-Ming Chen
Wei-Chen Wang
Guangxuan Xiao
Xingyu Dang
Chuang Gan
Song Han
EDLMQ
104
578
0
01 Jun 2023
LLaMA: Open and Efficient Foundation Language Models
LLaMA: Open and Efficient Foundation Language Models
Hugo Touvron
Thibaut Lavril
Gautier Izacard
Xavier Martinet
Marie-Anne Lachaux
...
Faisal Azhar
Aurelien Rodriguez
Armand Joulin
Edouard Grave
Guillaume Lample
ALMPILM
1.5K
13,472
0
27 Feb 2023
SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot
SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot
Elias Frantar
Dan Alistarh
VLM
113
734
0
02 Jan 2023
SmoothQuant: Accurate and Efficient Post-Training Quantization for Large
  Language Models
SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
Guangxuan Xiao
Ji Lin
Mickael Seznec
Hao Wu
Julien Demouth
Song Han
MQ
207
839
0
18 Nov 2022
Optimal Brain Compression: A Framework for Accurate Post-Training
  Quantization and Pruning
Optimal Brain Compression: A Framework for Accurate Post-Training Quantization and Pruning
Elias Frantar
Sidak Pal Singh
Dan Alistarh
MQ
109
243
0
24 Aug 2022
Language model compression with weighted low-rank factorization
Language model compression with weighted low-rank factorization
Yen-Chang Hsu
Ting Hua
Sung-En Chang
Qiang Lou
Yilin Shen
Hongxia Jin
66
108
0
30 Jun 2022
OPT: Open Pre-trained Transformer Language Models
OPT: Open Pre-trained Transformer Language Models
Susan Zhang
Stephen Roller
Naman Goyal
Mikel Artetxe
Moya Chen
...
Daniel Simig
Punit Singh Koura
Anjali Sridhar
Tianlu Wang
Luke Zettlemoyer
VLMOSLMAI4CE
362
3,699
0
02 May 2022
SPDY: Accurate Pruning with Speedup Guarantees
SPDY: Accurate Pruning with Speedup Guarantees
Elias Frantar
Dan Alistarh
64
35
0
31 Jan 2022
Evaluating Large Language Models Trained on Code
Evaluating Large Language Models Trained on Code
Mark Chen
Jerry Tworek
Heewoo Jun
Qiming Yuan
Henrique Pondé
...
Bob McGrew
Dario Amodei
Sam McCandlish
Ilya Sutskever
Wojciech Zaremba
ELMALM
238
5,665
0
07 Jul 2021
SimCSE: Simple Contrastive Learning of Sentence Embeddings
SimCSE: Simple Contrastive Learning of Sentence Embeddings
Tianyu Gao
Xingcheng Yao
Danqi Chen
AILawSSL
280
3,415
0
18 Apr 2021
Accelerating Sparse Deep Neural Networks
Accelerating Sparse Deep Neural Networks
Asit K. Mishra
J. Latorre
Jeff Pool
Darko Stosic
Dusan Stosic
Ganesh Venkatesh
Chong Yu
Paulius Micikevicius
164
236
0
16 Apr 2021
BinaryBERT: Pushing the Limit of BERT Quantization
BinaryBERT: Pushing the Limit of BERT Quantization
Haoli Bai
Wei Zhang
Lu Hou
Lifeng Shang
Jing Jin
Xin Jiang
Qun Liu
Michael Lyu
Irwin King
MQ
217
227
0
31 Dec 2020
Language Models are Few-Shot Learners
Language Models are Few-Shot Learners
Tom B. Brown
Benjamin Mann
Nick Ryder
Melanie Subbiah
Jared Kaplan
...
Christopher Berner
Sam McCandlish
Alec Radford
Ilya Sutskever
Dario Amodei
BDL
901
42,463
0
28 May 2020
Recipes for building an open-domain chatbot
Recipes for building an open-domain chatbot
Stephen Roller
Emily Dinan
Naman Goyal
Da Ju
Mary Williamson
...
Myle Ott
Kurt Shuster
Eric Michael Smith
Y-Lan Boureau
Jason Weston
ALM
125
1,014
0
28 Apr 2020
Exploring the Limits of Transfer Learning with a Unified Text-to-Text
  Transformer
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
500
20,342
0
23 Oct 2019
How Contextual are Contextualized Word Representations? Comparing the
  Geometry of BERT, ELMo, and GPT-2 Embeddings
How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings
Kawin Ethayarajh
93
875
0
02 Sep 2019
Importance Estimation for Neural Network Pruning
Importance Estimation for Neural Network Pruning
Pavlo Molchanov
Arun Mallya
Stephen Tyree
I. Frosio
Jan Kautz
3DPC
92
885
0
25 Jun 2019
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
Christopher Clark
Kenton Lee
Ming-Wei Chang
Tom Kwiatkowski
Michael Collins
Kristina Toutanova
247
1,560
0
24 May 2019
HellaSwag: Can a Machine Really Finish Your Sentence?
HellaSwag: Can a Machine Really Finish Your Sentence?
Rowan Zellers
Ari Holtzman
Yonatan Bisk
Ali Farhadi
Yejin Choi
185
2,532
0
19 May 2019
HAWQ: Hessian AWare Quantization of Neural Networks with Mixed-Precision
HAWQ: Hessian AWare Quantization of Neural Networks with Mixed-Precision
Zhen Dong
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
92
528
0
29 Apr 2019
The State of Sparsity in Deep Neural Networks
The State of Sparsity in Deep Neural Networks
Trevor Gale
Erich Elsen
Sara Hooker
165
763
0
25 Feb 2019
Rethinking the Value of Network Pruning
Rethinking the Value of Network Pruning
Zhuang Liu
Mingjie Sun
Tinghui Zhou
Gao Huang
Trevor Darrell
40
1,475
0
11 Oct 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language
  Understanding
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLMSSLSSeg
1.8K
95,229
0
11 Oct 2018
SNIP: Single-shot Network Pruning based on Connection Sensitivity
SNIP: Single-shot Network Pruning based on Connection Sensitivity
Namhoon Lee
Thalaiyasingam Ajanthan
Philip Torr
VLM
271
1,211
0
04 Oct 2018
Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book
  Question Answering
Can a Suit of Armor Conduct Electricity? A New Dataset for Open Book Question Answering
Todor Mihaylov
Peter Clark
Tushar Khot
Ashish Sabharwal
122
1,570
0
08 Sep 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language
  Understanding
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
1.1K
7,201
0
20 Apr 2018
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning
  Challenge
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge
Peter Clark
Isaac Cowhey
Oren Etzioni
Tushar Khot
Ashish Sabharwal
Carissa Schoenick
Oyvind Tafjord
ELMRALMLRM
198
2,670
0
14 Mar 2018
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
Jonathan Frankle
Michael Carbin
274
3,488
0
09 Mar 2018
Learning Efficient Convolutional Networks through Network Slimming
Learning Efficient Convolutional Networks through Network Slimming
Zhuang Liu
Jianguo Li
Zhiqiang Shen
Gao Huang
Shoumeng Yan
Changshui Zhang
133
2,426
0
22 Aug 2017
Scalable Training of Artificial Neural Networks with Adaptive Sparse
  Connectivity inspired by Network Science
Scalable Training of Artificial Neural Networks with Adaptive Sparse Connectivity inspired by Network Science
Decebal Constantin Mocanu
Elena Mocanu
Peter Stone
Phuong H. Nguyen
M. Gibescu
A. Liotta
184
637
0
15 Jul 2017
Pointer Sentinel Mixture Models
Pointer Sentinel Mixture Models
Stephen Merity
Caiming Xiong
James Bradbury
R. Socher
RALM
343
2,900
0
26 Sep 2016
Pruning Filters for Efficient ConvNets
Pruning Filters for Efficient ConvNets
Hao Li
Asim Kadav
Igor Durdanovic
H. Samet
H. Graf
3DPC
195
3,707
0
31 Aug 2016
SQuAD: 100,000+ Questions for Machine Comprehension of Text
SQuAD: 100,000+ Questions for Machine Comprehension of Text
Pranav Rajpurkar
Jian Zhang
Konstantin Lopyrev
Percy Liang
RALM
316
8,177
0
16 Jun 2016
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained
  Quantization and Huffman Coding
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding
Song Han
Huizi Mao
W. Dally
3DGS
263
8,862
0
01 Oct 2015
Learning both Weights and Connections for Efficient Neural Networks
Learning both Weights and Connections for Efficient Neural Networks
Song Han
Jeff Pool
J. Tran
W. Dally
CVBM
316
6,709
0
08 Jun 2015
Distributed Representations of Words and Phrases and their
  Compositionality
Distributed Representations of Words and Phrases and their Compositionality
Tomas Mikolov
Ilya Sutskever
Kai Chen
G. Corrado
J. Dean
NAIOCL
404
33,573
0
16 Oct 2013
Efficient Estimation of Word Representations in Vector Space
Efficient Estimation of Word Representations in Vector Space
Tomas Mikolov
Kai Chen
G. Corrado
J. Dean
3DV
693
31,553
0
16 Jan 2013
1