arXiv: 2201.12507
AutoDistil: Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models
29 January 2022
Dongkuan Xu, Subhabrata Mukherjee, Xiaodong Liu, Debadeepta Dey, Wenhui Wang, Xiang Zhang, Ahmed Hassan Awadallah, Jianfeng Gao
Papers citing
"AutoDistil: Few-shot Task-agnostic Neural Architecture Search for Distilling Large Language Models"
(5 of 5 citing papers shown)
A Systematic Study of Knowledge Distillation for Natural Language Generation with Pseudo-Target Training
Nitay Calderon, Subhabrata Mukherjee, Roi Reichart, Amir Kantor
03 May 2023
AutoMoE: Heterogeneous Mixture-of-Experts with Adaptive Computation for Efficient Neural Machine Translation
Ganesh Jawahar, Subhabrata Mukherjee, Xiaodong Liu, Young Jin Kim, Muhammad Abdul-Mageed, L. Lakshmanan, Ahmed Hassan Awadallah, Sébastien Bubeck, Jianfeng Gao
MoE
14 Oct 2022
S4: a High-sparsity, High-performance AI Accelerator
Ian En-Hsu Yen, Zhibin Xiao, Dongkuan Xu
16 Jul 2022
The Lottery Ticket Hypothesis for Pre-trained BERT Networks
Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Zhangyang Wang, Michael Carbin
23 Jul 2020
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman
ELM
20 Apr 2018