MKD: a Multi-Task Knowledge Distillation Approach for Pretrained Language Models
arXiv:1911.03588, 9 November 2019
Linqing Liu, Haiquan Wang, Jimmy J. Lin, R. Socher, Caiming Xiong

Papers citing "MKD: a Multi-Task Knowledge Distillation Approach for Pretrained Language Models" (5 of 5 shown):

HRKD: Hierarchical Relational Knowledge Distillation for Cross-domain Language Model Compression
Chenhe Dong, Yaliang Li, Ying Shen, Minghui Qiu (16 Oct 2021)

MixKD: Towards Efficient Distillation of Large-scale Language Models
Kevin J Liang, Weituo Hao, Dinghan Shen, Yufan Zhou, Weizhu Chen, Changyou Chen, Lawrence Carin (01 Nov 2020)

Pretrained Transformers for Text Ranking: BERT and Beyond
Jimmy J. Lin, Rodrigo Nogueira, Andrew Yates (13 Oct 2020)

GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Wang, Amanpreet Singh, Julian Michael, Felix Hill, Omer Levy, Samuel R. Bowman (20 Apr 2018)

Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu, M. Schuster, Z. Chen, Quoc V. Le, Mohammad Norouzi, ..., Alex Rudnick, Oriol Vinyals, G. Corrado, Macduff Hughes, J. Dean (26 Sep 2016)