ResearchTrend.AI
A Simple and Effective Pruning Approach for Large Language Models

20 June 2023
Mingjie Sun
Zhuang Liu
Anna Bair
J. Zico Kolter

Papers citing "A Simple and Effective Pruning Approach for Large Language Models"

50 / 272 papers shown
RoCoFT: Efficient Finetuning of Large Language Models with Row-Column Updates
Md. Kowsher
Tara Esmaeilbeig
Chun-Nam Yu
Mojtaba Soltanalian
Niloofar Yousefi
32
0
0
14 Oct 2024
Skipping Computations in Multimodal LLMs
Mustafa Shukor
Matthieu Cord
31
2
0
12 Oct 2024
DARE the Extreme: Revisiting Delta-Parameter Pruning For Fine-Tuned Models
Wenlong Deng
Yize Zhao
V. Vakilian
Minghui Chen
Xiaoxiao Li
Christos Thrampoulidis
45
3
0
12 Oct 2024
Compressing Large Language Models with Automated Sub-Network Search
R. Sukthanker
B. Staffler
Frank Hutter
Aaron Klein
LRM
38
0
0
09 Oct 2024
ParallelSpec: Parallel Drafter for Efficient Speculative Decoding
Zilin Xiao
Hongming Zhang
Tao Ge
Siru Ouyang
Vicente Ordonez
Dong Yu
41
5
0
08 Oct 2024
Mixture Compressor for Mixture-of-Experts LLMs Gains More
Wei Huang
Yue Liao
Jianhui Liu
Ruifei He
Haoru Tan
Shiming Zhang
Hongsheng Li
Si Liu
Xiaojuan Qi
MoE
39
3
0
08 Oct 2024
Superficial Safety Alignment Hypothesis
Jianwei Li
Jung-Eun Kim
29
1
0
07 Oct 2024
Large Language Model Inference Acceleration: A Comprehensive Hardware Perspective
Jinhao Li
Jiaming Xu
Shan Huang
Yonghua Chen
Wen Li
...
Jiayi Pan
Li Ding
Hao Zhou
Yu Wang
Guohao Dai
62
17
0
06 Oct 2024
RespDiff: An End-to-End Multi-scale RNN Diffusion Model for Respiratory Waveform Estimation from PPG Signals
Yuyang Miao
Zehua Chen
Chong Li
Danilo Mandic
DiffM
MedIm
38
0
0
06 Oct 2024
ARB-LLM: Alternating Refined Binarizations for Large Language Models
Zhiteng Li
Xinyu Yan
Tianao Zhang
Haotong Qin
Dong Xie
Jiang Tian
Zhongchao Shi
Linghe Kong
Yulun Zhang
Xiaokang Yang
MQ
37
2
0
04 Oct 2024
How Much Can RAG Help the Reasoning of LLM?
Jingyu Liu
Jiaen Lin
Yong Liu
LRM
39
9
0
03 Oct 2024
Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression
Jingcun Wang
Yu-Guang Chen
Ing-Chao Lin
Bing Li
Grace Li Zhang
35
4
0
02 Oct 2024
Do Influence Functions Work on Large Language Models?
Zhe Li
Wei Zhao
Yige Li
Jun Sun
TDI
36
1
0
30 Sep 2024
Two Sparse Matrices are Better than One: Sparsifying Neural Networks with Double Sparse Factorization
Vladimír Boža
Vladimír Macko
30
1
0
27 Sep 2024
MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models
Gongfan Fang
Hongxu Yin
Saurav Muralidharan
Greg Heinrich
Jeff Pool
Jan Kautz
Pavlo Molchanov
Xinchao Wang
37
3
0
26 Sep 2024
Enhancing Aspect-based Sentiment Analysis in Tourism Using Large Language Models and Positional Information
Chun Xu
Mengmeng Wang
Yan Ren
Shaolin Zhu
32
1
0
23 Sep 2024
OStr-DARTS: Differentiable Neural Architecture Search based on Operation Strength
Le Yang
Ziwei Zheng
Yizeng Han
Shiji Song
Gao Huang
Fan Li
26
1
0
22 Sep 2024
On Importance of Pruning and Distillation for Efficient Low Resource NLP
Aishwarya Mirashi
Purva Lingayat
Srushti Sonavane
Tejas Padhiyar
Raviraj Joshi
Geetanjali Kale
34
1
0
21 Sep 2024
OATS: Outlier-Aware Pruning Through Sparse and Low Rank Decomposition
Stephen Zhang
Vardan Papyan
VLM
51
1
0
20 Sep 2024
KVPruner: Structural Pruning for Faster and Memory-Efficient Large Language Models
Bo Lv
Quan Zhou
Xuanang Ding
Yan Wang
Zeming Ma
VLM
32
1
0
17 Sep 2024
S-STE: Continuous Pruning Function for Efficient 2:4 Sparse Pre-training
Yuezhou Hu
Jun-Jie Zhu
Jianfei Chen
45
0
0
13 Sep 2024
STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning
Jaeseong Lee
seung-won hwang
Aurick Qiao
Daniel F Campos
Z. Yao
Yuxiong He
30
2
0
10 Sep 2024
Wavelet GPT: Wavelet Inspired Large Language Models
Prateek Verma
AI4TS
23
0
0
04 Sep 2024
Mixed Sparsity Training: Achieving 4× FLOP Reduction for Transformer Pretraining
Pihe Hu
Shaolong Li
Longbo Huang
33
0
0
21 Aug 2024
Enhancing One-shot Pruned Pre-trained Language Models through Sparse-Dense-Sparse Mechanism
Guanchen Li
Xiandong Zhao
Lian Liu
Zeping Li
Dong Li
Lu Tian
Jie He
Ashish Sirasao
E. Barsoum
VLM
34
0
0
20 Aug 2024
MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding
Jian Chen
Vashisth Tiwari
Ranajoy Sadhukhan
Zhuoming Chen
Jinyuan Shi
Ian En-Hsu Yen
Avner May
Tianqi Chen
Beidi Chen
LRM
39
22
0
20 Aug 2024
MoDeGPT: Modular Decomposition for Large Language Model Compression
Chi-Heng Lin
Shangqian Gao
James Seale Smith
Abhishek Patel
Shikhar Tuli
Yilin Shen
Hongxia Jin
Yen-Chang Hsu
71
8
0
19 Aug 2024
FactorLLM: Factorizing Knowledge via Mixture of Experts for Large Language Models
Zhongyu Zhao
Menghang Dong
Rongyu Zhang
Wenzhao Zheng
Yunpeng Zhang
Huanrui Yang
Dalong Du
Kurt Keutzer
Shanghang Zhang
58
0
0
15 Aug 2024
LLMServingSim: A HW/SW Co-Simulation Infrastructure for LLM Inference Serving at Scale
Jaehong Cho
Minsu Kim
Hyunmin Choi
Guseul Heo
Jongse Park
49
9
0
10 Aug 2024
A Convex-optimization-based Layer-wise Post-training Pruner for Large Language Models
Pengxiang Zhao
Hanyu Hu
Ping Li
Yi Zheng
Zhefeng Wang
Xiaoming Yuan
44
1
0
07 Aug 2024
Logistic Regression makes small LLMs strong and explainable "tens-of-shot" classifiers
Marcus Buckmann
Edward Hill
40
2
0
06 Aug 2024
Inference Optimizations for Large Language Models: Effects, Challenges, and Practical Considerations
Leo Donisch
Sigurd Schacht
Carsten Lanquillon
30
2
0
06 Aug 2024
STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
Peijie Dong
Lujun Li
Dayou Du
Yuhan Chen
Zhenheng Tang
...
Wei Xue
Wenhan Luo
Qi-fei Liu
Yi-Ting Guo
Xiaowen Chu
MQ
58
4
0
03 Aug 2024
Mission Impossible: A Statistical Perspective on Jailbreaking LLMs
Jingtong Su
Mingyu Lee
SangKeun Lee
46
8
0
02 Aug 2024
Pruning Large Language Models with Semi-Structural Adaptive Sparse Training
Weiyu Huang
Yuezhou Hu
Guohao Jian
Jun Zhu
Jianfei Chen
35
5
0
30 Jul 2024
ThinK: Thinner Key Cache by Query-Driven Pruning
Yuhui Xu
Zhanming Jie
Hanze Dong
Lei Wang
Xudong Lu
Aojun Zhou
Amrita Saha
Caiming Xiong
Doyen Sahoo
75
15
0
30 Jul 2024
Greedy Output Approximation: Towards Efficient Structured Pruning for LLMs Without Retraining
Jianwei Li
Yijun Dong
Qi Lei
32
5
0
26 Jul 2024
Efficient Inference of Vision Instruction-Following Models with Elastic Cache
Zuyan Liu
Benlin Liu
Jiahui Wang
Yuhao Dong
Guangyi Chen
Yongming Rao
Ranjay Krishna
Jiwen Lu
VLM
48
9
0
25 Jul 2024
LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference
Qichen Fu
Minsik Cho
Thomas Merth
Sachin Mehta
Mohammad Rastegari
Mahyar Najibi
52
26
0
19 Jul 2024
Reconstruct the Pruned Model without Any Retraining
Pingjie Wang
Ziqing Fan
Shengchao Hu
Zhe Chen
Yanfeng Wang
Yu Wang
50
1
0
18 Jul 2024
MINI-LLM: Memory-Efficient Structured Pruning for Large Language Models
Hongrong Cheng
Miao Zhang
J. Q. Shi
57
2
0
16 Jul 2024
From GaLore to WeLore: How Low-Rank Weights Non-uniformly Emerge from Low-Rank Gradients
Ajay Jaiswal
Lu Yin
Zhenyu Zhang
Shiwei Liu
Jiawei Zhao
Yuandong Tian
Zhangyang Wang
38
14
0
15 Jul 2024
Real-Time Anomaly Detection and Reactive Planning with Large Language Models
Rohan Sinha
Amine Elhafsi
Christopher Agia
Matthew Foutter
Edward Schmerling
Marco Pavone
OffRL
LRM
45
27
0
11 Jul 2024
Adversarial Attacks and Defenses on Text-to-Image Diffusion Models: A Survey
Chenyu Zhang
Mingwang Hu
Wenhui Li
Lanjun Wang
41
15
0
10 Jul 2024
Composable Interventions for Language Models
Arinbjorn Kolbeinsson
Kyle O'Brien
Tianjin Huang
Shanghua Gao
Shiwei Liu
...
Anurag J. Vaidya
Faisal Mahmood
Marinka Zitnik
Tianlong Chen
Thomas Hartvigsen
KELM
MU
89
5
0
09 Jul 2024
Isomorphic Pruning for Vision Models
Gongfan Fang
Xinyin Ma
Michael Bi Mi
Xinchao Wang
VLM
ViT
42
6
0
05 Jul 2024
Efficient Expert Pruning for Sparse Mixture-of-Experts Language Models: Enhancing Performance and Reducing Inference Costs
Enshu Liu
Junyi Zhu
Zinan Lin
Xuefei Ning
Matthew B. Blaschko
Shengen Yan
Guohao Dai
Huazhong Yang
Yu Wang
MoE
62
5
0
01 Jul 2024
VcLLM: Video Codecs are Secretly Tensor Codecs
Ceyu Xu
Yongji Wu
Xinyu Yang
Beidi Chen
Matthew Lentz
Danyang Zhuo
Lisa Wu Wills
50
0
0
29 Jun 2024
JailbreakZoo: Survey, Landscapes, and Horizons in Jailbreaking Large Language and Vision-Language Models
Haibo Jin
Leyang Hu
Xinuo Li
Peiyan Zhang
Chonghan Chen
Jun Zhuang
Haohan Wang
PILM
36
26
0
26 Jun 2024
BlockLLM: Memory-Efficient Adaptation of LLMs by Selecting and Optimizing the Right Coordinate Blocks
A. Ramesh
Vignesh Ganapathiraman
I. Laradji
Mark W. Schmidt
40
1
0
25 Jun 2024