2109.07222
EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation
15 September 2021
Chenhe Dong
Guangrun Wang
Hang Xu
Jiefeng Peng
Xiaozhe Ren
Xiaodan Liang
Papers citing
"EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation"
16 / 16 papers shown
Title
EvoPress: Towards Optimal Dynamic Model Compression via Evolutionary Search
Oliver Sieberling
Denis Kuznedelev
Eldar Kurtic
Dan Alistarh
MQ
24
5
0
18 Oct 2024
DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions
Guangrun Wang
Changlin Li
Liuchun Yuan
Jiefeng Peng
Xiaoyu Xian
Xiaodan Liang
Xiaojun Chang
Liang Lin
46
1
0
02 Mar 2024
A Comprehensive Survey of Compression Algorithms for Language Models
Seungcheol Park
Jaehyeon Choi
Sojin Lee
U. Kang
MQ
32
12
0
27 Jan 2024
Breaking through Deterministic Barriers: Randomized Pruning Mask Generation and Selection
Jianwei Li
Weizhi Gao
Qi Lei
Dongkuan Xu
24
2
0
19 Oct 2023
NASH: A Simple Unified Framework of Structured Pruning for Accelerating Encoder-Decoder Language Models
Jongwoo Ko
Seungjoon Park
Yujin Kim
Sumyeong Ahn
Du-Seong Chang
Euijai Ahn
SeYoung Yun
16
4
0
16 Oct 2023
Meta-Tsallis-Entropy Minimization: A New Self-Training Approach for Domain Adaptation on Text Classification
Menglong Lu
Zhen Huang
Zhiliang Tian
Yunxiang Zhao
Xuanyu Fei
Dongsheng Li
OOD
34
5
0
04 Aug 2023
Rethink DARTS Search Space and Renovate a New Benchmark
Jiuling Zhang
Zhiming Ding
35
1
0
12 Jun 2023
SparseNeRF: Distilling Depth Ranking for Few-shot Novel View Synthesis
Guangcong Wang
Zhaoxi Chen
Chen Change Loy
Ziwei Liu
MDE
52
178
0
28 Mar 2023
AutoMoE: Heterogeneous Mixture-of-Experts with Adaptive Computation for Efficient Neural Machine Translation
Ganesh Jawahar
Subhabrata Mukherjee
Xiaodong Liu
Young Jin Kim
Muhammad Abdul-Mageed
L. Lakshmanan
Ahmed Hassan Awadallah
Sébastien Bubeck
Jianfeng Gao
MoE
30
5
0
14 Oct 2022
Examining Large Pre-Trained Language Models for Machine Translation: What You Don't Know About It
Lifeng Han
G. Erofeev
Irina Sorokina
Serge Gladkoff
Goran Nenadic
LM&MA
28
7
0
15 Sep 2022
Transkimmer: Transformer Learns to Layer-wise Skim
Yue Guan
Zhengyi Li
Jingwen Leng
Zhouhan Lin
Minyi Guo
80
38
0
15 May 2022
Meta Learning for Natural Language Processing: A Survey
Hung-yi Lee
Shang-Wen Li
Ngoc Thang Vu
54
42
0
03 May 2022
TransKD: Transformer Knowledge Distillation for Efficient Semantic Segmentation
R. Liu
Kailun Yang
Alina Roitberg
Jiaming Zhang
Kunyu Peng
Huayao Liu
Yaonan Wang
Rainer Stiefelhagen
ViT
47
36
0
27 Feb 2022
Improving Question Answering Performance Using Knowledge Distillation and Active Learning
Yasaman Boreshban
Seyed Morteza Mirbostani
Gholamreza Ghassem-Sani
Seyed Abolghasem Mirroshandel
Shahin Amiriparian
29
15
0
26 Sep 2021
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
233
576
0
12 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,959
0
20 Apr 2018