Improving Knowledge Distillation for BERT Models: Loss Functions, Mapping Methods, and Weight Tuning

26 August 2023
Apoorv Dankar, Adeem Jassani, Kartikaeya Kumar

Papers citing "Improving Knowledge Distillation for BERT Models: Loss Functions, Mapping Methods, and Weight Tuning"

3 / 3 papers shown
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
02 Oct 2019

TinyBERT: Distilling BERT for Natural Language Understanding
Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, F. Wang, Qun Liu
23 Sep 2019

Neural Network Acceptability Judgments
Alex Warstadt, Amanpreet Singh, Samuel R. Bowman
31 May 2018