2311.06382
Transfer Learning for Structured Pruning under Limited Task Data
10 November 2023
Lucio Dery
David Grangier
Awni Y. Hannun
Papers citing
"Transfer Learning for Structured Pruning under Limited Task Data"
17 / 17 papers shown
Title
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
Tim Dettmers
M. Lewis
Younes Belkada
Luke Zettlemoyer
MQ
105
662
0
15 Aug 2022
Structured Pruning Learns Compact and Accurate Models
Mengzhou Xia
Zexuan Zhong
Danqi Chen
VLM
69
187
0
01 Apr 2022
Auxiliary Task Update Decomposition: The Good, The Bad and The Neutral
Lucio Dery
Yann N. Dauphin
David Grangier
MoMe
71
29
0
25 Aug 2021
Don't Stop Pretraining: Adapt Language Models to Domains and Tasks
Suchin Gururangan
Ana Marasović
Swabha Swayamdipta
Kyle Lo
Iz Beltagy
Doug Downey
Noah A. Smith
VLM
AI4CE
CLL
164
2,435
0
23 Apr 2020
What's Hidden in a Randomly Weighted Neural Network?
Vivek Ramanujan
Mitchell Wortsman
Aniruddha Kembhavi
Ali Farhadi
Mohammad Rastegari
66
361
0
29 Nov 2019
Reducing Transformer Depth on Demand with Structured Dropout
Angela Fan
Edouard Grave
Armand Joulin
120
596
0
25 Sep 2019
TinyBERT: Distilling BERT for Natural Language Understanding
Xiaoqi Jiao
Yichun Yin
Lifeng Shang
Xin Jiang
Xiao Chen
Linlin Li
F. Wang
Qun Liu
VLM
113
1,869
0
23 Sep 2019
Non-Structured DNN Weight Pruning -- Is It Beneficial in Any Platform?
Xiaolong Ma
Sheng Lin
Shaokai Ye
Zhezhi He
Linfeng Zhang
...
Deliang Fan
Xuehai Qian
Xinyu Lin
Kaisheng Ma
Yanzhi Wang
MQ
100
92
0
03 Jul 2019
Are Sixteen Heads Really Better than One?
Paul Michel
Omer Levy
Graham Neubig
MoE
107
1,068
0
25 May 2019
Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned
Elena Voita
David Talbot
F. Moiseev
Rico Sennrich
Ivan Titov
117
1,148
0
23 May 2019
Rethinking the Value of Network Pruning
Zhuang Liu
Mingjie Sun
Tinghui Zhou
Gao Huang
Trevor Darrell
38
1,474
0
11 Oct 2018
Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction
Yi Luan
Luheng He
Mari Ostendorf
Hannaneh Hajishirzi
116
684
0
29 Aug 2018
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
1.1K
7,200
0
20 Apr 2018
The Lottery Ticket Hypothesis: Finding Sparse, Trainable Neural Networks
Jonathan Frankle
Michael Carbin
263
3,488
0
09 Mar 2018
PubMed 200k RCT: a Dataset for Sequential Sentence Classification in Medical Abstracts
Franck Dernoncourt
Ji Young Lee
71
230
0
17 Oct 2017
Learning both Weights and Connections for Efficient Neural Networks
Song Han
Jeff Pool
J. Tran
W. Dally
CVBM
316
6,700
0
08 Jun 2015
The Benefit of Multitask Representation Learning
Andreas Maurer
Massimiliano Pontil
Bernardino Romera-Paredes
SSL
109
376
0
23 May 2015