2005.00561
Cited By
When BERT Plays the Lottery, All Tickets Are Winning
1 May 2020
Sai Prasanna
Anna Rogers
Anna Rumshisky
MILM
Papers citing
"When BERT Plays the Lottery, All Tickets Are Winning"
37 / 37 papers shown
Title
Are formal and functional linguistic mechanisms dissociated in language models?
Michael Hanna
Sandro Pezzelle
Yonatan Belinkov
50
0
0
14 Mar 2025
NeuroPrune: A Neuro-inspired Topological Sparse Training Algorithm for Large Language Models
Amit Dhurandhar
Tejaswini Pedapati
Ronny Luss
Soham Dan
Aurélie C. Lozano
Payel Das
Georgios Kollias
22
3
0
28 Feb 2024
Adaptive Sparsity Level during Training for Efficient Time Series Forecasting with Transformers
Zahra Atashgahi
Mykola Pechenizkiy
Raymond N. J. Veldhuis
Decebal Constantin Mocanu
AI4TS
AI4CE
34
1
0
28 May 2023
Gradient-Free Structured Pruning with Unlabeled Data
Azade Nova
H. Dai
Dale Schuurmans
SyDa
40
20
0
07 Mar 2023
MUX-PLMs: Data Multiplexing for High-throughput Language Models
Vishvak Murahari
Ameet Deshpande
Carlos E. Jimenez
Izhak Shafran
Mingqiu Wang
Yuan Cao
Karthik Narasimhan
MoE
26
5
0
24 Feb 2023
Modular Deep Learning
Jonas Pfeiffer
Sebastian Ruder
Ivan Vulić
E. Ponti
MoMe
OOD
32
73
0
22 Feb 2023
An Empirical Study on the Transferability of Transformer Modules in Parameter-Efficient Fine-Tuning
Mohammad AkbarTajari
S. Rajaee
Mohammad Taher Pilehvar
11
2
0
01 Feb 2023
Robust Lottery Tickets for Pre-trained Language Models
Rui Zheng
Rong Bao
Yuhao Zhou
Di Liang
Sirui Wang
Wei Wu
Tao Gui
Qi Zhang
Xuanjing Huang
AAML
30
13
0
06 Nov 2022
Data-Efficient Cross-Lingual Transfer with Language-Specific Subnetworks
Rochelle Choenni
Dan Garrette
Ekaterina Shutova
24
2
0
31 Oct 2022
Compressing And Debiasing Vision-Language Pre-Trained Models for Visual Question Answering
Q. Si
Yuanxin Liu
Zheng Lin
Peng Fu
Weiping Wang
VLM
42
1
0
26 Oct 2022
Parameter-Efficient Tuning with Special Token Adaptation
Xiaocong Yang
James Y. Huang
Wenxuan Zhou
Muhao Chen
34
12
0
10 Oct 2022
Doge Tickets: Uncovering Domain-general Language Models by Playing Lottery Tickets
Yi Yang
Chen Zhang
Benyou Wang
Dawei Song
LRM
24
6
0
20 Jul 2022
The Topological BERT: Transforming Attention into Topology for Natural Language Processing
Ilan Perez
Raphael Reinauer
30
17
0
30 Jun 2022
Diverse Lottery Tickets Boost Ensemble from a Single Pretrained Model
Sosuke Kobayashi
Shun Kiyono
Jun Suzuki
Kentaro Inui
MoMe
29
7
0
24 May 2022
Outliers Dimensions that Disrupt Transformers Are Driven by Frequency
Giovanni Puccetti
Anna Rogers
Aleksandr Drozd
F. Dell’Orletta
79
42
0
23 May 2022
Life after BERT: What do Other Muppets Understand about Language?
Vladislav Lialin
Kevin Zhao
Namrata Shivagunde
Anna Rumshisky
47
6
0
21 May 2022
Structured Pruning Learns Compact and Accurate Models
Mengzhou Xia
Zexuan Zhong
Danqi Chen
VLM
9
177
0
01 Apr 2022
Playing Lottery Tickets in Style Transfer Models
Meihao Kong
Jing Huo
Wenbin Li
Jing Wu
Yu-Kun Lai
Yang Gao
27
1
0
25 Mar 2022
Can Model Compression Improve NLP Fairness
Guangxuan Xu
Qingyuan Hu
31
26
0
21 Jan 2022
From Dense to Sparse: Contrastive Pruning for Better Pre-trained Language Model Compression
Runxin Xu
Fuli Luo
Chengyu Wang
Baobao Chang
Jun Huang
Songfang Huang
Fei Huang
VLM
27
25
0
14 Dec 2021
Pruning Self-attentions into Convolutional Layers in Single Path
Haoyu He
Jianfei Cai
Jing Liu
Zizheng Pan
Jing Zhang
Dacheng Tao
Bohan Zhuang
ViT
34
40
0
23 Nov 2021
Recent Advances in Natural Language Processing via Large Pre-Trained Language Models: A Survey
Bonan Min
Hayley L Ross
Elior Sulem
Amir Pouran Ben Veyseh
Thien Huu Nguyen
Oscar Sainz
Eneko Agirre
Ilana Heinz
Dan Roth
LM&MA
VLM
AI4CE
83
1,035
0
01 Nov 2021
Accelerating Framework of Transformer by Hardware Design and Model Compression Co-Optimization
Panjie Qi
E. Sha
Qingfeng Zhuge
Hongwu Peng
Shaoyi Huang
Zhenglun Kong
Yuhong Song
Bingbing Li
11
49
0
19 Oct 2021
Understanding and Overcoming the Challenges of Efficient Transformer Quantization
Yelysei Bondarenko
Markus Nagel
Tijmen Blankevoort
MQ
25
133
0
27 Sep 2021
Survey: Transformer based Video-Language Pre-training
Ludan Ruan
Qin Jin
VLM
ViT
72
44
0
21 Sep 2021
What's Hidden in a One-layer Randomly Weighted Transformer?
Sheng Shen
Z. Yao
Douwe Kiela
Kurt Keutzer
Michael W. Mahoney
32
4
0
08 Sep 2021
Differentiable Subset Pruning of Transformer Heads
Jiaoda Li
Ryan Cotterell
Mrinmaya Sachan
45
53
0
10 Aug 2021
Learned Token Pruning for Transformers
Sehoon Kim
Sheng Shen
D. Thorsley
A. Gholami
Woosuk Kwon
Joseph Hassoun
Kurt Keutzer
17
146
0
02 Jul 2021
Sanity Checks for Lottery Tickets: Does Your Winning Ticket Really Win the Jackpot?
Xiaolong Ma
Geng Yuan
Xuan Shen
Tianlong Chen
Xuxi Chen
...
Ning Liu
Minghai Qin
Sijia Liu
Zhangyang Wang
Yanzhi Wang
21
63
0
01 Jul 2021
BitFit: Simple Parameter-efficient Fine-tuning for Transformer-based Masked Language-models
Elad Ben-Zaken
Shauli Ravfogel
Yoav Goldberg
67
1,154
0
18 Jun 2021
Pre-Trained Models: Past, Present and Future
Xu Han
Zhengyan Zhang
Ning Ding
Yuxian Gu
Xiao Liu
...
Jie Tang
Ji-Rong Wen
Jinhui Yuan
Wayne Xin Zhao
Jun Zhu
AIFin
MQ
AI4MH
46
815
0
14 Jun 2021
Compacter: Efficient Low-Rank Hypercomplex Adapter Layers
Rabeeh Karimi Mahabadi
James Henderson
Sebastian Ruder
MoE
67
468
0
08 Jun 2021
Playing Lottery Tickets with Vision and Language
Zhe Gan
Yen-Chun Chen
Linjie Li
Tianlong Chen
Yu Cheng
Shuohang Wang
Jingjing Liu
Lijuan Wang
Zicheng Liu
VLM
109
54
0
23 Apr 2021
The Rediscovery Hypothesis: Language Models Need to Meet Linguistics
Vassilina Nikoulina
Maxat Tezekbayev
Nuradil Kozhakhmet
Madina Babazhanova
Matthias Gallé
Z. Assylbekov
34
8
0
02 Mar 2021
The Lottery Tickets Hypothesis for Supervised and Self-supervised Pre-training in Computer Vision Models
Tianlong Chen
Jonathan Frankle
Shiyu Chang
Sijia Liu
Yang Zhang
Michael Carbin
Zhangyang Wang
27
122
0
12 Dec 2020
The Lottery Ticket Hypothesis for Pre-trained BERT Networks
Tianlong Chen
Jonathan Frankle
Shiyu Chang
Sijia Liu
Yang Zhang
Zhangyang Wang
Michael Carbin
156
345
0
23 Jul 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
299
6,984
0
20 Apr 2018