ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2005.07683
  4. Cited By
Movement Pruning: Adaptive Sparsity by Fine-Tuning

Movement Pruning: Adaptive Sparsity by Fine-Tuning

15 May 2020
Victor Sanh
Thomas Wolf
Alexander M. Rush
ArXivPDFHTML

Papers citing "Movement Pruning: Adaptive Sparsity by Fine-Tuning"

32 / 81 papers shown
Title
Efficient Methods for Natural Language Processing: A Survey
Efficient Methods for Natural Language Processing: A Survey
Marcos Vinícius Treviso
Ji-Ung Lee
Tianchu Ji
Betty van Aken
Qingqing Cao
...
Emma Strubell
Niranjan Balasubramanian
Leon Derczynski
Iryna Gurevych
Roy Schwartz
28
109
0
31 Aug 2022
PLATON: Pruning Large Transformer Models with Upper Confidence Bound of
  Weight Importance
PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance
Qingru Zhang
Simiao Zuo
Chen Liang
Alexander Bukharin
Pengcheng He
Weizhu Chen
T. Zhao
17
77
0
25 Jun 2022
FlashAttention: Fast and Memory-Efficient Exact Attention with
  IO-Awareness
FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
Tri Dao
Daniel Y. Fu
Stefano Ermon
Atri Rudra
Christopher Ré
VLM
63
2,024
0
27 May 2022
Spartan: Differentiable Sparsity via Regularized Transportation
Spartan: Differentiable Sparsity via Regularized Transportation
Kai Sheng Tai
Taipeng Tian
Ser-Nam Lim
25
11
0
27 May 2022
Diverse Lottery Tickets Boost Ensemble from a Single Pretrained Model
Diverse Lottery Tickets Boost Ensemble from a Single Pretrained Model
Sosuke Kobayashi
Shun Kiyono
Jun Suzuki
Kentaro Inui
MoMe
23
7
0
24 May 2022
Outliers Dimensions that Disrupt Transformers Are Driven by Frequency
Outliers Dimensions that Disrupt Transformers Are Driven by Frequency
Giovanni Puccetti
Anna Rogers
Aleksandr Drozd
F. Dell’Orletta
73
42
0
23 May 2022
Serving and Optimizing Machine Learning Workflows on Heterogeneous
  Infrastructures
Serving and Optimizing Machine Learning Workflows on Heterogeneous Infrastructures
Yongji Wu
Matthew Lentz
Danyang Zhuo
Yao Lu
23
22
0
10 May 2022
Monarch: Expressive Structured Matrices for Efficient and Accurate
  Training
Monarch: Expressive Structured Matrices for Efficient and Accurate Training
Tri Dao
Beidi Chen
N. Sohoni
Arjun D Desai
Michael Poli
Jessica Grogan
Alexander Liu
Aniruddh Rao
Atri Rudra
Christopher Ré
22
87
0
01 Apr 2022
Structured Pruning Learns Compact and Accurate Models
Structured Pruning Learns Compact and Accurate Models
Mengzhou Xia
Zexuan Zhong
Danqi Chen
VLM
9
177
0
01 Apr 2022
TextPruner: A Model Pruning Toolkit for Pre-Trained Language Models
TextPruner: A Model Pruning Toolkit for Pre-Trained Language Models
Ziqing Yang
Yiming Cui
Zhigang Chen
SyDa
VLM
23
12
0
30 Mar 2022
Improve Convolutional Neural Network Pruning by Maximizing Filter
  Variety
Improve Convolutional Neural Network Pruning by Maximizing Filter Variety
Nathan Hubens
M. Mancas
B. Gosselin
Marius Preda
T. Zaharia
19
2
0
11 Mar 2022
Rare Gems: Finding Lottery Tickets at Initialization
Rare Gems: Finding Lottery Tickets at Initialization
Kartik K. Sreenivasan
Jy-yong Sohn
Liu Yang
Matthew Grinde
Alliot Nagle
Hongyi Wang
Eric P. Xing
Kangwook Lee
Dimitris Papailiopoulos
24
42
0
24 Feb 2022
Deadwooding: Robust Global Pruning for Deep Neural Networks
Deadwooding: Robust Global Pruning for Deep Neural Networks
Sawinder Kaur
Ferdinando Fioretto
Asif Salekin
19
4
0
10 Feb 2022
Accelerating DNN Training with Structured Data Gradient Pruning
Accelerating DNN Training with Structured Data Gradient Pruning
Bradley McDanel
Helia Dinh
J. Magallanes
6
7
0
01 Feb 2022
Speedup deep learning models on GPU by taking advantage of efficient
  unstructured pruning and bit-width reduction
Speedup deep learning models on GPU by taking advantage of efficient unstructured pruning and bit-width reduction
Marcin Pietroñ
Dominik Zurek
22
13
0
28 Dec 2021
From Dense to Sparse: Contrastive Pruning for Better Pre-trained
  Language Model Compression
From Dense to Sparse: Contrastive Pruning for Better Pre-trained Language Model Compression
Runxin Xu
Fuli Luo
Chengyu Wang
Baobao Chang
Jun Huang
Songfang Huang
Fei Huang
VLM
27
25
0
14 Dec 2021
Pruning Pretrained Encoders with a Multitask Objective
Pruning Pretrained Encoders with a Multitask Objective
Patrick Xia
Richard Shin
42
0
0
10 Dec 2021
Pixelated Butterfly: Simple and Efficient Sparse training for Neural
  Network Models
Pixelated Butterfly: Simple and Efficient Sparse training for Neural Network Models
Tri Dao
Beidi Chen
Kaizhao Liang
Jiaming Yang
Zhao-quan Song
Atri Rudra
Christopher Ré
33
75
0
30 Nov 2021
How Well Do Sparse Imagenet Models Transfer?
How Well Do Sparse Imagenet Models Transfer?
Eugenia Iofinova
Alexandra Peste
Mark Kurtz
Dan Alistarh
19
38
0
26 Nov 2021
Prune Once for All: Sparse Pre-Trained Language Models
Prune Once for All: Sparse Pre-Trained Language Models
Ofir Zafrir
Ariel Larey
Guy Boudoukh
Haihao Shen
Moshe Wasserblat
VLM
34
82
0
10 Nov 2021
BERMo: What can BERT learn from ELMo?
BERMo: What can BERT learn from ELMo?
Sangamesh Kodge
Kaushik Roy
35
3
0
18 Oct 2021
Learning Compact Metrics for MT
Learning Compact Metrics for MT
Amy Pu
Hyung Won Chung
Ankur P. Parikh
Sebastian Gehrmann
Thibault Sellam
22
98
0
12 Oct 2021
Structured Pattern Pruning Using Regularization
Structured Pattern Pruning Using Regularization
Dongju Park
Geunghee Lee
18
0
0
18 Sep 2021
What's Hidden in a One-layer Randomly Weighted Transformer?
What's Hidden in a One-layer Randomly Weighted Transformer?
Sheng Shen
Z. Yao
Douwe Kiela
Kurt Keutzer
Michael W. Mahoney
29
4
0
08 Sep 2021
LightNER: A Lightweight Tuning Paradigm for Low-resource NER via
  Pluggable Prompting
LightNER: A Lightweight Tuning Paradigm for Low-resource NER via Pluggable Prompting
Xiang Chen
Lei Li
Shumin Deng
Chuanqi Tan
Changliang Xu
Fei Huang
Luo Si
Huajun Chen
Ningyu Zhang
VLM
34
65
0
31 Aug 2021
Layer-wise Model Pruning based on Mutual Information
Layer-wise Model Pruning based on Mutual Information
Chun Fan
Jiwei Li
Xiang Ao
Fei Wu
Yuxian Meng
Xiaofei Sun
46
19
0
28 Aug 2021
Differentiable Subset Pruning of Transformer Heads
Differentiable Subset Pruning of Transformer Heads
Jiaoda Li
Ryan Cotterell
Mrinmaya Sachan
37
53
0
10 Aug 2021
Learned Token Pruning for Transformers
Learned Token Pruning for Transformers
Sehoon Kim
Sheng Shen
D. Thorsley
A. Gholami
Woosuk Kwon
Joseph Hassoun
Kurt Keutzer
9
145
0
02 Jul 2021
Dual-side Sparse Tensor Core
Dual-side Sparse Tensor Core
Yang-Feng Wang
Chen Zhang
Zhiqiang Xie
Cong Guo
Yunxin Liu
Jingwen Leng
12
74
0
20 May 2021
The Rediscovery Hypothesis: Language Models Need to Meet Linguistics
The Rediscovery Hypothesis: Language Models Need to Meet Linguistics
Vassilina Nikoulina
Maxat Tezekbayev
Nuradil Kozhakhmet
Madina Babazhanova
Matthias Gallé
Z. Assylbekov
34
8
0
02 Mar 2021
Parameter-Efficient Transfer Learning with Diff Pruning
Parameter-Efficient Transfer Learning with Diff Pruning
Demi Guo
Alexander M. Rush
Yoon Kim
13
383
0
14 Dec 2020
It's Not Just Size That Matters: Small Language Models Are Also Few-Shot
  Learners
It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
Timo Schick
Hinrich Schütze
22
953
0
15 Sep 2020
Previous
12