DistilQwen2.5: Industrial Practices of Training Distilled Open Lightweight Language Models
Chengyu Wang, Junbing Yan, Yuanhao Yue, Jun Huang
arXiv:2504.15027 | 21 April 2025

Papers citing "DistilQwen2.5: Industrial Practices of Training Distilled Open Lightweight Language Models" (10 papers shown)

Each entry below lists the title, authors, topic tags where shown, the site's listing counters in their original order, and the publication date.

EasyDistill: A Comprehensive Toolkit for Effective Knowledge Distillation of Large Language Models
Chengyu Wang, Junbing Yan, Wenrui Cai, Yuanhao Yue, Jun Huang
VLM | 32 | 0 | 0 | 27 May 2025

Length-Controlled AlpacaEval: A Simple Way to Debias Automatic Evaluators
Yann Dubois, Balázs Galambosi, Percy Liang, Tatsunori Hashimoto
ALM | 136 | 402 | 0 | 06 Apr 2024

MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues
Ge Bai, Jie Liu, Xingyuan Bu, Yancheng He, Jiaheng Liu, ..., Zhuoran Lin, Wenbo Su, Tiezheng Ge, Bo Zheng, Wanli Ouyang
ELM, LM&MA | 100 | 93 | 0 | 22 Feb 2024

MUFFIN: Curating Multi-Faceted Instructions for Improving Instruction-Following
Renze Lou, Kai Zhang, Jian Xie, Yuxuan Sun, Janice Ahn, Hanzi Xu, Yu Su, Wenpeng Yin
87 | 30 | 0 | 05 Dec 2023

f-Divergence Minimization for Sequence-Level Knowledge Distillation
Yuqiao Wen, Zichao Li, Wenyu Du, Lili Mou
79 | 61 | 0 | 27 Jul 2023

Meta-KD: A Meta Knowledge Distillation Framework for Language Model Compression across Domains
Haojie Pan, Chengyu Wang, Minghui Qiu, Yichang Zhang, Yaliang Li, Jun Huang
76 | 51 | 0 | 02 Dec 2020

A Large-Scale Chinese Short-Text Conversation Dataset
Yida Wang, Pei Ke, Yinhe Zheng, Kaili Huang, Yong Jiang, Xiaoyan Zhu, Minlie Huang
54 | 136 | 0 | 10 Aug 2020

MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices
Zhiqing Sun, Hongkun Yu, Xiaodan Song, Renjie Liu, Yiming Yang, Denny Zhou
MQ | 118 | 817 | 0 | 06 Apr 2020

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Victor Sanh, Lysandre Debut, Julien Chaumond, Thomas Wolf
255 | 7,554 | 0 | 02 Oct 2019

TinyBERT: Distilling BERT for Natural Language Understanding
Xiaoqi Jiao, Yichun Yin, Lifeng Shang, Xin Jiang, Xiao Chen, Linlin Li, F. Wang, Qun Liu
VLM | 113 | 1,872 | 0 | 23 Sep 2019