Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2204.00408
Cited By
Structured Pruning Learns Compact and Accurate Models
1 April 2022
Mengzhou Xia
Zexuan Zhong
Danqi Chen
VLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Structured Pruning Learns Compact and Accurate Models"
46 / 46 papers shown
Title
Onboard Optimization and Learning: A Survey
Monirul Islam Pavel
Siyi Hu
Mahardhika Pratama
Ryszard Kowalczyk
33
0
0
07 May 2025
FineScope : Precision Pruning for Domain-Specialized Large Language Models Using SAE-Guided Self-Data Cultivation
Chaitali Bhattacharyya
Yeseong Kim
50
0
0
01 May 2025
Soup-of-Experts: Pretraining Specialist Models via Parameters Averaging
Pierre Ablin
Angelos Katharopoulos
Skyler Seto
David Grangier
MoMe
55
0
0
03 Feb 2025
CURing Large Models: Compression via CUR Decomposition
Sanghyeon Park
Soo-Mook Moon
46
0
0
08 Jan 2025
Puzzle: Distillation-Based NAS for Inference-Optimized LLMs
Akhiad Bercovich
Tomer Ronen
Talor Abramovich
Nir Ailon
Nave Assaf
...
Ido Shahaf
Oren Tropp
Omer Ullman Argov
Ran Zilberstein
Ran El-Yaniv
91
1
0
28 Nov 2024
Understanding Layer Significance in LLM Alignment
Guangyuan Shi
Zexin Lu
Xiaoyu Dong
Wenlong Zhang
Xuanyu Zhang
Yujie Feng
Xiao-Ming Wu
58
2
0
23 Oct 2024
OATS: Outlier-Aware Pruning Through Sparse and Low Rank Decomposition
Stephen Zhang
Vardan Papyan
VLM
53
1
0
20 Sep 2024
HESSO: Towards Automatic Efficient and User Friendly Any Neural Network Training and Pruning
Tianyi Chen
Xiaoyi Qu
David Aponte
Colby R. Banbury
Jongwoo Ko
Tianyu Ding
Yong Ma
Vladimir Lyapunov
Ilya Zharkov
Luming Liang
83
1
0
11 Sep 2024
MoDeGPT: Modular Decomposition for Large Language Model Compression
Chi-Heng Lin
Shangqian Gao
James Seale Smith
Abhishek Patel
Shikhar Tuli
Yilin Shen
Hongxia Jin
Yen-Chang Hsu
71
9
0
19 Aug 2024
Finding Transformer Circuits with Edge Pruning
Adithya Bhaskar
Alexander Wettig
Dan Friedman
Danqi Chen
68
17
0
24 Jun 2024
The Unreasonable Ineffectiveness of the Deeper Layers
Andrey Gromov
Kushal Tirumala
Hassan Shapourian
Paolo Glorioso
Daniel A. Roberts
59
84
0
26 Mar 2024
CLLMs: Consistency Large Language Models
Siqi Kou
Lanxiang Hu
Zhe He
Zhijie Deng
Hao Zhang
52
28
0
28 Feb 2024
Why Lift so Heavy? Slimming Large Language Models by Cutting Off the Layers
Shuzhou Yuan
Ercong Nie
Bolei Ma
Michael Farber
47
3
0
18 Feb 2024
Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes
Lucio Dery
Steven Kolawole
Jean-Francois Kagey
Virginia Smith
Graham Neubig
Ameet Talwalkar
52
28
0
08 Feb 2024
DE
3
^3
3
-BERT: Distance-Enhanced Early Exiting for BERT based on Prototypical Networks
Jianing He
Qi Zhang
Weiping Ding
Duoqian Miao
Jun Zhao
Liang Hu
LongBing Cao
40
3
0
03 Feb 2024
FFSplit: Split Feed-Forward Network For Optimizing Accuracy-Efficiency Trade-off in Language Model Inference
Zirui Liu
Qingquan Song
Q. Xiao
Sathiya Keerthi Selvaraj
Rahul Mazumder
Aman Gupta
Xia Hu
45
4
0
08 Jan 2024
DONUT-hole: DONUT Sparsification by Harnessing Knowledge and Optimizing Learning Efficiency
Azhar Shaikh
Michael Cochez
Denis Diachkov
Michiel de Rijcke
Sahar Yousefi
43
0
0
09 Nov 2023
Learning Evaluation Models from Large Language Models for Sequence Generation
Chenglong Wang
Hang Zhou
Kai-Chun Chang
Tongran Liu
Chunliang Zhang
Quan Du
Tong Xiao
Yue Zhang
Jingbo Zhu
ELM
48
3
0
08 Aug 2023
Accurate Retraining-free Pruning for Pretrained Encoder-based Language Models
Seungcheol Park
Ho-Jin Choi
U. Kang
VLM
42
5
0
07 Aug 2023
Low-Rank Prune-And-Factorize for Language Model Compression
Siyu Ren
Kenny Q. Zhu
24
9
0
25 Jun 2023
A Simple and Effective Pruning Approach for Large Language Models
Mingjie Sun
Zhuang Liu
Anna Bair
J. Zico Kolter
90
361
0
20 Jun 2023
LoSparse: Structured Compression of Large Language Models based on Low-Rank and Sparse Approximation
Yixiao Li
Yifan Yu
Qingru Zhang
Chen Liang
Pengcheng He
Weizhu Chen
Tuo Zhao
44
69
0
20 Jun 2023
Selective Pre-training for Private Fine-tuning
Da Yu
Sivakanth Gopi
Janardhan Kulkarni
Zinan Lin
Saurabh Naik
Tomasz Religa
Jian Yin
Huishuai Zhang
43
19
0
23 May 2023
Lifting the Curse of Capacity Gap in Distilling Language Models
Chen Zhang
Yang Yang
Jiahao Liu
Jingang Wang
Yunsen Xian
Benyou Wang
Dawei Song
MoE
32
19
0
20 May 2023
Conditional Adapters: Parameter-efficient Transfer Learning with Fast Inference
Tao Lei
Junwen Bai
Siddhartha Brahma
Joshua Ainslie
Kenton Lee
...
Vincent Zhao
Yuexin Wu
Bo-wen Li
Yu Zhang
Ming-Wei Chang
BDL
AI4CE
30
55
0
11 Apr 2023
MiniRBT: A Two-stage Distilled Small Chinese Pre-trained Model
Xin Yao
Ziqing Yang
Yiming Cui
Shijin Wang
36
3
0
03 Apr 2023
Greener yet Powerful: Taming Large Code Generation Models with Quantization
Xiaokai Wei
Sujan Kumar Gonugondla
W. Ahmad
Shiqi Wang
Baishakhi Ray
...
Ben Athiwaratkun
Mingyue Shang
M. K. Ramanathan
Parminder Bhatia
Bing Xiang
MQ
41
6
0
09 Mar 2023
Gradient-Free Structured Pruning with Unlabeled Data
Azade Nova
H. Dai
Dale Schuurmans
SyDa
40
20
0
07 Mar 2023
Structured Pruning of Self-Supervised Pre-trained Models for Speech Recognition and Understanding
Yifan Peng
Kwangyoun Kim
Felix Wu
Prashant Sridhar
Shinji Watanabe
37
34
0
27 Feb 2023
Full Stack Optimization of Transformer Inference: a Survey
Sehoon Kim
Coleman Hooper
Thanakul Wattanawong
Minwoo Kang
Ruohan Yan
...
Qijing Huang
Kurt Keutzer
Michael W. Mahoney
Y. Shao
A. Gholami
MQ
41
102
0
27 Feb 2023
MUX-PLMs: Data Multiplexing for High-throughput Language Models
Vishvak Murahari
Ameet Deshpande
Carlos E. Jimenez
Izhak Shafran
Mingqiu Wang
Yuan Cao
Karthik Narasimhan
MoE
31
5
0
24 Feb 2023
HomoDistil: Homotopic Task-Agnostic Distillation of Pre-trained Transformers
Chen Liang
Haoming Jiang
Zheng Li
Xianfeng Tang
Bin Yin
Tuo Zhao
VLM
32
24
0
19 Feb 2023
What Matters In The Structured Pruning of Generative Language Models?
Michael Santacroce
Zixin Wen
Yelong Shen
Yuan-Fang Li
28
33
0
07 Feb 2023
Feasibility and Transferability of Transfer Learning: A Mathematical Framework
Haoyang Cao
Haotian Gu
Xin Guo
M. Rosenbaum
41
2
0
27 Jan 2023
Gradient-based Intra-attention Pruning on Pre-trained Language Models
Ziqing Yang
Yiming Cui
Xin Yao
Shijin Wang
VLM
42
8
0
15 Dec 2022
MPCViT: Searching for Accurate and Efficient MPC-Friendly Vision Transformer with Heterogeneous Attention
Wenyuan Zeng
Meng Li
Wenjie Xiong
Tong Tong
Wen-jie Lu
Jin Tan
Runsheng Wang
Ru Huang
29
21
0
25 Nov 2022
A Survey for Efficient Open Domain Question Answering
Qin Zhang
Shan Chen
Dongkuan Xu
Qingqing Cao
Xiaojun Chen
Trevor Cohn
Meng Fang
33
33
0
15 Nov 2022
COST-EFF: Collaborative Optimization of Spatial and Temporal Efficiency with Slenderized Multi-exit Language Models
Bowen Shen
Zheng Lin
Yuanxin Liu
Zhengxiao Liu
Lei Wang
Weiping Wang
VLM
52
4
0
27 Oct 2022
Efficiently Controlling Multiple Risks with Pareto Testing
Bracha Laufer-Goldshtein
Adam Fisch
Regina Barzilay
Tommi Jaakkola
38
16
0
14 Oct 2022
Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso
Ji-Ung Lee
Tianchu Ji
Betty van Aken
Qingqing Cao
...
Emma Strubell
Niranjan Balasubramanian
Leon Derczynski
Iryna Gurevych
Roy Schwartz
38
109
0
31 Aug 2022
Spartan: Differentiable Sparsity via Regularized Transportation
Kai Sheng Tai
Taipeng Tian
Ser-Nam Lim
34
11
0
27 May 2022
A Survey on Model Compression and Acceleration for Pretrained Language Models
Canwen Xu
Julian McAuley
30
58
0
15 Feb 2022
The Lottery Ticket Hypothesis for Pre-trained BERT Networks
Tianlong Chen
Jonathan Frankle
Shiyu Chang
Sijia Liu
Yang Zhang
Zhangyang Wang
Michael Carbin
156
345
0
23 Jul 2020
Comparing Rewinding and Fine-tuning in Neural Network Pruning
Alex Renda
Jonathan Frankle
Michael Carbin
235
383
0
05 Mar 2020
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
236
578
0
12 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
304
7,005
0
20 Apr 2018
1