ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2012.06153
  4. Cited By
Improving Task-Agnostic BERT Distillation with Layer Mapping Search

Improving Task-Agnostic BERT Distillation with Layer Mapping Search

11 December 2020
Xiaoqi Jiao
Huating Chang
Yichun Yin
Lifeng Shang
Xin Jiang
Xiao Chen
Linlin Li
Fang Wang
Qun Liu
ArXivPDFHTML

Papers citing "Improving Task-Agnostic BERT Distillation with Layer Mapping Search"

5 / 5 papers shown
Title
f-Divergence Minimization for Sequence-Level Knowledge Distillation
f-Divergence Minimization for Sequence-Level Knowledge Distillation
Yuqiao Wen
Zichao Li
Wenyu Du
Lili Mou
30
53
0
27 Jul 2023
Neural Architecture Search for Effective Teacher-Student Knowledge
  Transfer in Language Models
Neural Architecture Search for Effective Teacher-Student Knowledge Transfer in Language Models
Aashka Trivedi
Takuma Udagawa
Michele Merler
Rameswar Panda
Yousef El-Kurdi
Bishwaranjan Bhattacharjee
30
7
0
16 Mar 2023
BERT-of-Theseus: Compressing BERT by Progressive Module Replacing
BERT-of-Theseus: Compressing BERT by Progressive Module Replacing
Canwen Xu
Wangchunshu Zhou
Tao Ge
Furu Wei
Ming Zhou
221
197
0
07 Feb 2020
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Q-BERT: Hessian Based Ultra Low Precision Quantization of BERT
Sheng Shen
Zhen Dong
Jiayu Ye
Linjian Ma
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
MQ
233
576
0
12 Sep 2019
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language
  Understanding
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding
Alex Jinpeng Wang
Amanpreet Singh
Julian Michael
Felix Hill
Omer Levy
Samuel R. Bowman
ELM
297
6,959
0
20 Apr 2018
1