arXiv: 2105.14444
NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search
30 May 2021
Jin Xu
Xu Tan
Renqian Luo
Kaitao Song
Jian Li
Tao Qin
Tie-Yan Liu
MQ
Papers citing
"NAS-BERT: Task-Agnostic and Adaptive-Size BERT Compression with Neural Architecture Search"
18 / 18 papers shown
EfficientLLM: Scalable Pruning-Aware Pretraining for Architecture-Agnostic Edge Language Models
Xingrun Xing
Zheng Liu
Shitao Xiao
Boyan Gao
Yiming Liang
Wanpeng Zhang
Haokun Lin
Guoqi Li
Jiajun Zhang
LRM
64
1
0
10 Feb 2025
Merino: Entropy-driven Design for Generative Language Models on IoT Devices
Youpeng Zhao
Ming Lin
Huadong Tang
Qiang Wu
Jun Wang
83
0
0
28 Jan 2025
Efficiently Distilling LLMs for Edge Applications
Achintya Kundu
Fabian Lim
Aaron Chew
L. Wynter
Penny Chong
Rhui Dih Lee
47
6
0
01 Apr 2024
Evolutionary Neural Architecture Search for Transformer in Knowledge Tracing
Shangshang Yang
Xiaoshan Yu
Ye Tian
Xueming Yan
Haiping Ma
Xingyi Zhang
ViT
KELM
AI4Ed
24
2
0
02 Oct 2023
Meta-Tsallis-Entropy Minimization: A New Self-Training Approach for Domain Adaptation on Text Classification
Menglong Lu
Zhen Huang
Zhiliang Tian
Yunxiang Zhao
Xuanyu Fei
Dongsheng Li
OOD
34
5
0
04 Aug 2023
Sensi-BERT: Towards Sensitivity Driven Fine-Tuning for Parameter-Efficient BERT
Souvik Kundu
S. Nittur
Maciej Szankin
Sairam Sundaresan
MQ
30
2
0
14 Jul 2023
EdgeTran: Co-designing Transformers for Efficient Inference on Mobile Edge Platforms
Shikhar Tuli
N. Jha
36
3
0
24 Mar 2023
Gradient-Free Structured Pruning with Unlabeled Data
Azade Nova
H. Dai
Dale Schuurmans
SyDa
40
20
0
07 Mar 2023
Full Stack Optimization of Transformer Inference: a Survey
Sehoon Kim
Coleman Hooper
Thanakul Wattanawong
Minwoo Kang
Ruohan Yan
...
Qijing Huang
Kurt Keutzer
Michael W. Mahoney
Y. Shao
A. Gholami
MQ
36
101
0
27 Feb 2023
Design Automation for Fast, Lightweight, and Effective Deep Learning Models: A Survey
Dalin Zhang
Kaixuan Chen
Yan Zhao
B. Yang
Li-Ping Yao
Christian S. Jensen
48
3
0
22 Aug 2022
AutoDistil: Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models
Dongkuan Xu
Subhabrata Mukherjee
Xiaodong Liu
Debadeepta Dey
Wenhui Wang
Xiang Zhang
Ahmed Hassan Awadallah
Jianfeng Gao
25
4
0
29 Jan 2022
AutoHEnsGNN: Winning Solution to AutoGraph Challenge for KDD Cup 2020
Jin Xu
Mingjian Chen
Jianqiang Huang
Xingyuan Tang
Ke Hu
Jian Li
Jia Cheng
Jun Lei
23
2
0
25 Nov 2021
Pruning Self-attentions into Convolutional Layers in Single Path
Haoyu He
Jianfei Cai
Jing Liu
Zizheng Pan
Jing Zhang
Dacheng Tao
Bohan Zhuang
ViT
34
40
0
23 Nov 2021
Towards Efficient Post-training Quantization of Pre-trained Language Models
Haoli Bai
Lu Hou
Lifeng Shang
Xin Jiang
Irwin King
M. Lyu
MQ
82
47
0
30 Sep 2021
EfficientBERT: Progressively Searching Multilayer Perceptron via Warm-up Knowledge Distillation
Chenhe Dong
Guangrun Wang
Hang Xu
Jiefeng Peng
Xiaozhe Ren
Xiaodan Liang
24
28
0
15 Sep 2021
BERT-of-Theseus: Compressing BERT by Progressive Module Replacing
Canwen Xu
Wangchunshu Zhou
Tao Ge
Furu Wei
Ming Zhou
229
197
0
07 Feb 2020
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
236
576
0
12 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
299
6,984
0
20 Apr 2018