Selective Knowledge Distillation for Neural Machine Translation

27 May 2021
Fusheng Wang, Jianhao Yan, Fandong Meng, Jie Zhou

Papers citing "Selective Knowledge Distillation for Neural Machine Translation"

15 papers shown
Learning Critically: Selective Self Distillation in Federated Learning on Non-IID Data
Yuting He, Yiqiang Chen, Xiaodong Yang, H. Yu, Yi-Hua Huang, Yang Gu
FedML
20 Apr 2025

StructVPR++: Distill Structural and Semantic Knowledge with Weighting Samples for Visual Place Recognition
Yanqing Shen, Sanping Zhou, Jingwen Fu, Ke Xu, Shitao Chen, N. Zheng
09 Mar 2025

Don't Throw Away Data: Better Sequence Knowledge Distillation
Jun Wang, Eleftheria Briakou, Hamid Dadkhahi, Rishabh Agarwal, Colin Cherry, Trevor Cohn
15 Jul 2024

Sentence-Level or Token-Level? A Comprehensive Study on Knowledge Distillation
Jingxuan Wei, Linzhuang Sun, Yichong Leng, Xu Tan, Bihui Yu, Ruifeng Guo
23 Apr 2024

An Empirical Investigation into the Effect of Parameter Choices in Knowledge Distillation
Md Arafat Sultan, Aashka Trivedi, Parul Awasthy, Avirup Sil
12 Jan 2024

A Systematic Study of Knowledge Distillation for Natural Language Generation with Pseudo-Target Training
Nitay Calderon, Subhabrata Mukherjee, Roi Reichart, Amir Kantor
03 May 2023

Greener yet Powerful: Taming Large Code Generation Models with Quantization
Xiaokai Wei, Sujan Kumar Gonugondla, W. Ahmad, Shiqi Wang, Baishakhi Ray, ..., Ben Athiwaratkun, Mingyue Shang, M. K. Ramanathan, Parminder Bhatia, Bing Xiang
MQ
09 Mar 2023

StructVPR: Distill Structural Knowledge with Weighting Samples for Visual Place Recognition
Yanqing Shen, Sanping Zhou, Jingwen Fu, Ke Xu, Shitao Chen, Nanning Zheng
02 Dec 2022

Summer: WeChat Neural Machine Translation Systems for the WMT22 Biomedical Translation Task
Ernan Li, Fandong Meng, Jie Zhou
MedIm
28 Nov 2022

BJTU-WeChat's Systems for the WMT22 Chat Translation Task
Yunlong Liang, Fandong Meng, Jinan Xu, Jie Zhou
28 Nov 2022

SMaLL-100: Introducing Shallow Multilingual Machine Translation Model for Low-Resource Languages
Alireza Mohammadshahi, Vassilina Nikoulina, Alexandre Berard, Caroline Brun, James Henderson, Laurent Besacier
VLM, MoE, LRM
20 Oct 2022

What Do Compressed Multilingual Machine Translation Models Forget?
Alireza Mohammadshahi, Vassilina Nikoulina, Alexandre Berard, Caroline Brun, James Henderson, Laurent Besacier
AI4CE
22 May 2022

Nearest Neighbor Knowledge Distillation for Neural Machine Translation
Zhixian Yang, Renliang Sun, Xiaojun Wan
01 May 2022

WeChat Neural Machine Translation Systems for WMT21
Xianfeng Zeng, Yanjun Liu, Ernan Li, Qiu Ran, Fandong Meng, Peng Li, Jinan Xu, Jie Zhou
05 Aug 2021

Learning Light-Weight Translation Models from Deep Transformer
Bei Li, Ziyang Wang, Hui Liu, Quan Du, Tong Xiao, Chunliang Zhang, Jingbo Zhu
VLM
27 Dec 2020