Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks (arXiv:2102.08124)
16 February 2021
Itay Hubara, Brian Chmiel, Moshe Island, Ron Banner, S. Naor, Daniel Soudry
Papers citing "Accelerated Sparse Neural Training: A Provable and Efficient Method to Find N:M Transposable Masks" (50 of 77 papers shown)
Efficient Mixture of Geographical Species for On Device Wildlife Monitoring
Emmanuel Azuh Mensah, Joban Mand, Yueheng Ou, Min Jang, Kurtis Heimerl
11 Apr 2025

Thanos: A Block-wise Pruning Algorithm for Efficient Large Language Model Compression
Ivan Ilin, Peter Richtárik
06 Apr 2025

STADE: Standard Deviation as a Pruning Metric
Diego Coello de Portugal Mecke, Haya Alyoussef, Ilia Koloiarov, Maximilian Stubbemann, Lars Schmidt-Thieme
28 Mar 2025

Theoretical Foundation of Flow-Based Time Series Generation: Provable Approximation, Generalization, and Efficiency
Jiangxuan Long, Zhao-quan Song, Chiwun Yang
18 Mar 2025 [AI4TS]

FedSpaLLM: Federated Pruning of Large Language Models
Guangji Bai, Yijiang Li, Zilinghan Li, Liang Zhao, Kibaek Kim
20 Feb 2025 [FedML]

SLoPe: Double-Pruned Sparse Plus Lazy Low-Rank Adapter Pretraining of LLMs
Mohammad Mozaffari, Amir Yazdanbakhsh, Zhao Zhang, M. Dehnavi
28 Jan 2025

Inducing Semi-Structured Sparsity by Masking for Efficient Model Inference in Convolutional Networks
David A. Danhofer
01 Nov 2024

Pruning Foundation Models for High Accuracy without Retraining
Pu Zhao, Fei Sun, Xuan Shen, Pinrui Yu, Zhenglun Kong, Yanzhi Wang, Xue Lin
21 Oct 2024

EvoPress: Towards Optimal Dynamic Model Compression via Evolutionary Search
Oliver Sieberling, Denis Kuznedelev, Eldar Kurtic, Dan Alistarh
18 Oct 2024 [MQ]

Mixture Compressor for Mixture-of-Experts LLMs Gains More
Wei Huang, Yue Liao, Jianhui Liu, Ruifei He, Haoru Tan, Shiming Zhang, Hongsheng Li, Si Liu, Xiaojuan Qi
08 Oct 2024 [MoE]

Aggressive Post-Training Compression on Extremely Large Language Models
Zining Zhang, Yao Chen, Bingsheng He, Zhenjie Zhang
30 Sep 2024

S-STE: Continuous Pruning Function for Efficient 2:4 Sparse Pre-training
Yuezhou Hu, Jun-Jie Zhu, Jianfei Chen
13 Sep 2024

A Tighter Complexity Analysis of SparseGPT
Xiaoyu Li, Yingyu Liang, Zhenmei Shi, Zhao-quan Song
22 Aug 2024

STBLLM: Breaking the 1-Bit Barrier with Structured Binary LLMs
Peijie Dong, Lujun Li, Dayou Du, Yuhan Chen, Zhenheng Tang, ..., Wei Xue, Wenhan Luo, Qi-fei Liu, Yi-Ting Guo, Xiaowen Chu
03 Aug 2024 [MQ]

Pruning Large Language Models with Semi-Structural Adaptive Sparse Training
Weiyu Huang, Yuezhou Hu, Guohao Jian, Jun Zhu, Jianfei Chen
30 Jul 2024

Toward Efficient Permutation for Hierarchical N:M Sparsity on GPUs
Seungmin Yu, Xiaodie Yi, Hayun Lee, Dongkun Shin
30 Jul 2024

Nerva: a Truly Sparse Implementation of Neural Networks
Wieger Wesselink, Bram Grooten, Qiao Xiao, Cássio Machado de Campos, Mykola Pechenizkiy
24 Jul 2024

Let the Code LLM Edit Itself When You Edit the Code
Zhenyu He, Jun Zhang, Shengjie Luo, Jingjing Xu, Z. Zhang, Di He
03 Jul 2024 [KELM]

LayerMerge: Neural Network Depth Compression through Layer Pruning and Merging
Jinuk Kim, Marwa El Halabi, Mingi Ji, Hyun Oh Song
18 Jun 2024

ALPS: Improved Optimization for Highly Sparse One-Shot Pruning for Large Language Models
Xiang Meng, Kayhan Behdin, Haoyue Wang, Rahul Mazumder
12 Jun 2024

Effective Interplay between Sparsity and Quantization: From Theory to Practice
Simla Burcu Harma, Ayan Chakraborty, Elizaveta Kostenok, Danila Mishin, Dongho Ha, ..., Martin Jaggi, Ming Liu, Yunho Oh, Suvinay Subramanian, Amir Yazdanbakhsh
31 May 2024 [MQ]

TriLoRA: Integrating SVD for Advanced Style Personalization in Text-to-Image Generation
Chengcheng Feng, Mu He, Qiuyu Tian, Haojie Yin, Xiaofang Zhao, Hongwei Tang, Xingqiang Wei
18 May 2024 [DiffM]

SparseDM: Toward Sparse Efficient Diffusion Models
Kafeng Wang, Jianfei Chen, He Li, Zhenpeng Mi, Jun-Jie Zhu
16 Apr 2024 [DiffM]

Multilingual Brain Surgeon: Large Language Models Can be Compressed Leaving No Language Behind
Hongchuan Zeng, Hongshen Xu, Lu Chen, Kai Yu
06 Apr 2024

Accelerating Transformer Pre-training with 2:4 Sparsity
Yuezhou Hu, Kang Zhao, Weiyu Huang, Jianfei Chen, Jun Zhu
02 Apr 2024

Abstracting Sparse DNN Acceleration via Structured Sparse Tensor Decomposition
Geonhwa Jeong, Po-An Tsai, Abhimanyu Bambhaniya, S. Keckler, Tushar Krishna
12 Mar 2024

DPPA: Pruning Method for Large Language Model to Model Merging
Yaochen Zhu, Rui Xia, Jiajun Zhang
05 Mar 2024 [MoMe]

SparseLLM: Towards Global Pruning for Pre-trained Language Models
Guangji Bai, Yijiang Li, Chen Ling, Kibaek Kim, Liang Zhao
28 Feb 2024

Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models
Xudong Lu, Qi Liu, Yuhui Xu, Aojun Zhou, Siyuan Huang, Bo-Wen Zhang, Junchi Yan, Hongsheng Li
22 Feb 2024 [MoE]

A Survey on Transformer Compression
Yehui Tang, Yunhe Wang, Jianyuan Guo, Zhijun Tu, Kai Han, Hailin Hu, Dacheng Tao
05 Feb 2024

RoSA: Accurate Parameter-Efficient Fine-Tuning via Robust Adaptation
Mahdi Nikdan, Soroush Tabesh, Elvir Crnčević, Dan Alistarh
09 Jan 2024

FFSplit: Split Feed-Forward Network For Optimizing Accuracy-Efficiency Trade-off in Language Model Inference
Zirui Liu, Qingquan Song, Q. Xiao, Sathiya Keerthi Selvaraj, Rahul Mazumder, Aman Gupta, Xia Hu
08 Jan 2024

Fast and Optimal Weight Update for Pruned Large Language Models
Vladimír Boza
01 Jan 2024

MaxQ: Multi-Axis Query for N:M Sparsity Network
Jingyang Xiang, Siqi Li, Junhao Chen, Zhuangzhi Chen, Tianxin Huang, Linpeng Peng, Yong-Jin Liu
12 Dec 2023

Nonparametric Variational Regularisation of Pretrained Transformers
Fabio Fehr, James Henderson
01 Dec 2023

REST: Retrieval-Based Speculative Decoding
Zhenyu He, Zexuan Zhong, Tianle Cai, Jason D. Lee, Di He
14 Nov 2023 [RALM]

One-Shot Sensitivity-Aware Mixed Sparsity Pruning for Large Language Models
Hang Shao, Bei Liu, Bo Xiao, Ke Zeng, Guanglu Wan, Yanmin Qian
14 Oct 2023

Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs
Yu-xin Zhang, Lirui Zhao, Mingbao Lin, Yunyun Sun, Yiwu Yao, Xingjia Han, Jared Tanner, Shiwei Liu, Rongrong Ji
13 Oct 2023 [SyDa]

Compresso: Structured Pruning with Collaborative Prompting Learns Compact Large Language Models
Song Guo, Jiahang Xu, Li Lyna Zhang, Mao Yang
08 Oct 2023

SPADE: Sparsity-Guided Debugging for Deep Neural Networks
Arshia Soltani Moakhar, Eugenia Iofinova, Elias Frantar, Dan Alistarh
06 Oct 2023

Scaling Laws for Sparsely-Connected Foundation Models
Elias Frantar, C. Riquelme, N. Houlsby, Dan Alistarh, Utku Evci
15 Sep 2023

Model Compression Methods for YOLOv5: A Review
Mohammad Jani, Jamil Fayyad, Younes Al Younes, H. Najjaran
21 Jul 2023

Systematic Investigation of Sparse Perturbed Sharpness-Aware Minimization Optimizer
Peng Mi, Li Shen, Tianhe Ren, Yiyi Zhou, Tianshuo Xu, Xiaoshuai Sun, Tongliang Liu, Rongrong Ji, Dacheng Tao
30 Jun 2023 [AAML]

A Simple and Effective Pruning Approach for Large Language Models
Mingjie Sun, Zhuang Liu, Anna Bair, J. Zico Kolter
20 Jun 2023

Spatial Re-parameterization for N:M Sparsity
Yu-xin Zhang, Mingbao Lin, Yunshan Zhong, Mengzhao Chen, Rongrong Ji
09 Jun 2023

Dynamic Sparsity Is Channel-Level Sparsity Learner
Lu Yin, Gen Li, Meng Fang, Lijuan Shen, Tianjin Huang, Zhangyang Wang, Vlado Menkovski, Xiaolong Ma, Mykola Pechenizkiy, Shiwei Liu
30 May 2023

Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt
Zhaozhuo Xu, Zirui Liu, Beidi Chen, Yuxin Tang, Jue Wang, Kaixiong Zhou, Xia Hu, Anshumali Shrivastava
17 May 2023 [MQ]

SpecInfer: Accelerating Generative Large Language Model Serving with Tree-based Speculative Inference and Verification
Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Xinhao Cheng, Zeyu Wang, ..., Chunan Shi, Zhuoming Chen, Daiyaan Arfeen, Reyna Abhyankar, Zhihao Jia
16 May 2023 [LRM]

JaxPruner: A concise library for sparsity research
Jooyoung Lee, Wonpyo Park, Nicole Mitchell, Jonathan Pilault, J. Obando-Ceron, ..., Hong-Seok Kim, Yann N. Dauphin, Karolina Dziugaite, Pablo Samuel Castro, Utku Evci
27 Apr 2023

Vision Models Can Be Efficiently Specialized via Few-Shot Task-Aware Compression
Denis Kuznedelev, Soroush Tabesh, Kimia Noorbakhsh, Elias Frantar, Sara Beery, Eldar Kurtic, Dan Alistarh
25 Mar 2023 [MQ, VLM]