ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 1606.07947
  4. Cited By
Sequence-Level Knowledge Distillation

Sequence-Level Knowledge Distillation

25 June 2016
Yoon Kim
Alexander M. Rush
ArXivPDFHTML

Papers citing "Sequence-Level Knowledge Distillation"

50 / 244 papers shown
Title
Ensembling and Knowledge Distilling of Large Sequence Taggers for
  Grammatical Error Correction
Ensembling and Knowledge Distilling of Large Sequence Taggers for Grammatical Error Correction
M. Tarnavskyi
Artem Chernodub
Kostiantyn Omelianchuk
3DV
25
24
0
24 Mar 2022
Exploring Continuous Integrate-and-Fire for Adaptive Simultaneous Speech
  Translation
Exploring Continuous Integrate-and-Fire for Adaptive Simultaneous Speech Translation
Chih-Chiang Chang
Hung-yi Lee
27
13
0
22 Mar 2022
Self-Distribution Distillation: Efficient Uncertainty Estimation
Self-Distribution Distillation: Efficient Uncertainty Estimation
Yassir Fathullah
Mark J. F. Gales
UQCV
22
11
0
15 Mar 2022
Low-Rank Softmax Can Have Unargmaxable Classes in Theory but Rarely in
  Practice
Low-Rank Softmax Can Have Unargmaxable Classes in Theory but Rarely in Practice
Andreas Grivas
Nikolay Bogoychev
Adam Lopez
15
9
0
12 Mar 2022
Efficient Sub-structured Knowledge Distillation
Efficient Sub-structured Knowledge Distillation
Wenye Lin
Yangming Li
Lemao Liu
Shuming Shi
Haitao Zheng
12
1
0
09 Mar 2022
Relational Surrogate Loss Learning
Relational Surrogate Loss Learning
Tao Huang
Zekang Li
Hua Lu
Yong Shan
Shusheng Yang
Yang Feng
Fei Wang
Shan You
Chang Xu
24
5
0
26 Feb 2022
EdgeFormer: A Parameter-Efficient Transformer for On-Device Seq2seq
  Generation
EdgeFormer: A Parameter-Efficient Transformer for On-Device Seq2seq Generation
Tao Ge
Si-Qing Chen
Furu Wei
MoE
32
21
0
16 Feb 2022
Exploring the Limits of Domain-Adaptive Training for Detoxifying
  Large-Scale Language Models
Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models
Wei Ping
Ming-Yu Liu
Chaowei Xiao
P. Xu
M. Patwary
M. Shoeybi
Bo-wen Li
Anima Anandkumar
Bryan Catanzaro
25
65
0
08 Feb 2022
Improving Neural Machine Translation by Denoising Training
Improving Neural Machine Translation by Denoising Training
Liang Ding
Keqin Peng
Dacheng Tao
VLM
AI4CE
41
6
0
19 Jan 2022
CLIP-TD: CLIP Targeted Distillation for Vision-Language Tasks
CLIP-TD: CLIP Targeted Distillation for Vision-Language Tasks
Zhecan Wang
Noel Codella
Yen-Chun Chen
Luowei Zhou
Jianwei Yang
Xiyang Dai
Bin Xiao
Haoxuan You
Shih-Fu Chang
Lu Yuan
CLIP
VLM
22
39
0
15 Jan 2022
Can Multilinguality benefit Non-autoregressive Machine Translation?
Can Multilinguality benefit Non-autoregressive Machine Translation?
Sweta Agrawal
Julia Kreutzer
Colin Cherry
AI4CE
29
1
0
16 Dec 2021
Towards More Efficient Insertion Transformer with Fractional Positional
  Encoding
Towards More Efficient Insertion Transformer with Fractional Positional Encoding
Zhisong Zhang
Yizhe Zhang
W. Dolan
46
0
0
12 Dec 2021
Sequence-level self-learning with multiple hypotheses
Sequence-level self-learning with multiple hypotheses
K. Kumatani
Dimitrios Dimitriadis
Yashesh Gaur
R. Gmyr
Sefik Emre Eskimez
Jinyu Li
Michael Zeng
SSL
25
1
0
10 Dec 2021
Hierarchical Knowledge Distillation for Dialogue Sequence Labeling
Hierarchical Knowledge Distillation for Dialogue Sequence Labeling
Shota Orihashi
Yoshihiro Yamazaki
Naoki Makishima
Mana Ihori
Akihiko Takashima
Tomohiro Tanaka
Ryo Masumura
17
0
0
22 Nov 2021
Symbolic Knowledge Distillation: from General Language Models to
  Commonsense Models
Symbolic Knowledge Distillation: from General Language Models to Commonsense Models
Peter West
Chandrasekhar Bhagavatula
Jack Hessel
Jena D. Hwang
Liwei Jiang
Ronan Le Bras
Ximing Lu
Sean Welleck
Yejin Choi
SyDa
54
320
0
14 Oct 2021
Semi-Autoregressive Image Captioning
Semi-Autoregressive Image Captioning
Xu Yan
Zhengcong Fei
Zekang Li
Shuhui Wang
Qingming Huang
Qi Tian
35
23
0
11 Oct 2021
Multilingual AMR Parsing with Noisy Knowledge Distillation
Multilingual AMR Parsing with Noisy Knowledge Distillation
Deng Cai
Xin Li
Jackie Chun-Sing Ho
Lidong Bing
W. Lam
27
18
0
30 Sep 2021
Integrated Training for Sequence-to-Sequence Models Using
  Non-Autoregressive Transformer
Integrated Training for Sequence-to-Sequence Models Using Non-Autoregressive Transformer
Evgeniia Tokarchuk
Jan Rosendahl
Weiyue Wang
Pavel Petrushkov
Tomer Lancewicki
Shahram Khadivi
Hermann Ney
28
2
0
27 Sep 2021
Partial to Whole Knowledge Distillation: Progressive Distilling
  Decomposed Knowledge Boosts Student Better
Partial to Whole Knowledge Distillation: Progressive Distilling Decomposed Knowledge Boosts Student Better
Xuanyang Zhang
Xinming Zhang
Jian Sun
25
1
0
26 Sep 2021
Beyond Distillation: Task-level Mixture-of-Experts for Efficient
  Inference
Beyond Distillation: Task-level Mixture-of-Experts for Efficient Inference
Sneha Kudugunta
Yanping Huang
Ankur Bapna
M. Krikun
Dmitry Lepikhin
Minh-Thang Luong
Orhan Firat
MoE
119
107
0
24 Sep 2021
The Volctrans GLAT System: Non-autoregressive Translation Meets WMT21
The Volctrans GLAT System: Non-autoregressive Translation Meets WMT21
Lihua Qian
Yi Zhou
Zaixiang Zheng
Yaoming Zhu
Zehui Lin
Jiangtao Feng
Shanbo Cheng
Lei Li
Mingxuan Wang
Hao Zhou
31
18
0
23 Sep 2021
TranslateLocally: Blazing-fast translation running on the local CPU
TranslateLocally: Blazing-fast translation running on the local CPU
Nikolay Bogoychev
Jelmer Van der Linde
Kenneth Heafield
22
3
0
21 Sep 2021
The NiuTrans System for WNGT 2020 Efficiency Task
The NiuTrans System for WNGT 2020 Efficiency Task
Chi Hu
Bei Li
Ye Lin
Yinqiao Li
Yanyang Li
Chenglong Wang
Tong Xiao
Jingbo Zhu
23
7
0
16 Sep 2021
The NiuTrans System for the WMT21 Efficiency Task
The NiuTrans System for the WMT21 Efficiency Task
Chenglong Wang
Chi Hu
Yongyu Mu
Zhongxiang Yan
Siming Wu
...
Hang Cao
Bei Li
Ye Lin
Tong Xiao
Jingbo Zhu
29
2
0
16 Sep 2021
Improving Neural Machine Translation by Bidirectional Training
Improving Neural Machine Translation by Bidirectional Training
Liang Ding
Di Wu
Dacheng Tao
29
29
0
16 Sep 2021
Scaling Laws for Neural Machine Translation
Scaling Laws for Neural Machine Translation
Behrooz Ghorbani
Orhan Firat
Markus Freitag
Ankur Bapna
M. Krikun
Xavier Garcia
Ciprian Chelba
Colin Cherry
40
99
0
16 Sep 2021
AligNART: Non-autoregressive Neural Machine Translation by Jointly
  Learning to Estimate Alignment and Translate
AligNART: Non-autoregressive Neural Machine Translation by Jointly Learning to Estimate Alignment and Translate
Jongyoon Song
Sungwon Kim
Sungroh Yoon
74
37
0
14 Sep 2021
IndicBART: A Pre-trained Model for Indic Natural Language Generation
IndicBART: A Pre-trained Model for Indic Natural Language Generation
Raj Dabre
Himani Shrotriya
Anoop Kunchukuttan
Ratish Puduppully
Mitesh M. Khapra
Pratyush Kumar
39
70
0
07 Sep 2021
Survey of Low-Resource Machine Translation
Survey of Low-Resource Machine Translation
Barry Haddow
Rachel Bawden
Antonio Valerio Miceli Barone
Jindvrich Helcl
Alexandra Birch
AIMat
31
148
0
01 Sep 2021
Learning Energy-Based Approximate Inference Networks for Structured
  Applications in NLP
Learning Energy-Based Approximate Inference Networks for Structured Applications in NLP
Lifu Tu
BDL
35
0
0
27 Aug 2021
WeChat Neural Machine Translation Systems for WMT21
WeChat Neural Machine Translation Systems for WMT21
Xianfeng Zeng
Yanjun Liu
Ernan Li
Qiu Ran
Fandong Meng
Peng Li
Jinan Xu
Jie Zhou
25
20
0
05 Aug 2021
The USYD-JD Speech Translation System for IWSLT 2021
The USYD-JD Speech Translation System for IWSLT 2021
Liang Ding
Di Wu
Dacheng Tao
29
16
0
24 Jul 2021
Trustworthy AI: A Computational Perspective
Trustworthy AI: A Computational Perspective
Haochen Liu
Yiqi Wang
Wenqi Fan
Xiaorui Liu
Yaxin Li
Shaili Jain
Yunhao Liu
Anil K. Jain
Jiliang Tang
FaML
104
196
0
12 Jul 2021
The NiuTrans End-to-End Speech Translation System for IWSLT 2021 Offline
  Task
The NiuTrans End-to-End Speech Translation System for IWSLT 2021 Offline Task
Chen Xu
Xiaoqian Liu
Xiaowen Liu
Laohu Wang
Canan Huang
Tong Xiao
Jingbo Zhu
34
5
0
06 Jul 2021
ESPnet-ST IWSLT 2021 Offline Speech Translation System
ESPnet-ST IWSLT 2021 Offline Speech Translation System
Hirofumi Inaguma
Shun Kiyono
Nelson Enrique Yalta Soplin
Pengcheng Guo
Jun Suzuki
Kevin Duh
Shinji Watanabe
3DV
37
2
0
01 Jul 2021
The USTC-NELSLIP Systems for Simultaneous Speech Translation Task at
  IWSLT 2021
The USTC-NELSLIP Systems for Simultaneous Speech Translation Task at IWSLT 2021
Dan Liu
Mengge Du
Xiaoxi Li
Yuchen Hu
Lirong Dai
19
20
0
01 Jul 2021
A Survey on Neural Speech Synthesis
A Survey on Neural Speech Synthesis
Xu Tan
Tao Qin
Frank Soong
Tie-Yan Liu
AI4TS
18
352
0
29 Jun 2021
Learning-based Framework for Sensor Fault-Tolerant Building HVAC Control
  with Model-assisted Learning
Learning-based Framework for Sensor Fault-Tolerant Building HVAC Control with Model-assisted Learning
Shichao Xu
Yangyang Fu
Yixuan Wang
Zheng O’Neill
Qi Zhu
AI4CE
11
16
0
27 Jun 2021
Dealing with training and test segmentation mismatch: FBK@IWSLT2021
Dealing with training and test segmentation mismatch: FBK@IWSLT2021
Sara Papi
Marco Gaido
Matteo Negri
Marco Turchi
39
6
0
23 Jun 2021
Collaborative Training of Acoustic Encoders for Speech Recognition
Collaborative Training of Acoustic Encoders for Speech Recognition
Varun K. Nagaraja
Yangyang Shi
Ganesh Venkatesh
Ozlem Kalinli
M. Seltzer
Vikas Chandra
43
11
0
16 Jun 2021
Generate, Annotate, and Learn: NLP with Synthetic Text
Generate, Annotate, and Learn: NLP with Synthetic Text
Xuanli He
Islam Nassar
J. Kiros
Gholamreza Haffari
Mohammad Norouzi
39
51
0
11 Jun 2021
Scalable Transformers for Neural Machine Translation
Scalable Transformers for Neural Machine Translation
Peng Gao
Shijie Geng
Ping Luo
Xiaogang Wang
Jifeng Dai
Hongsheng Li
31
13
0
04 Jun 2021
Diversifying Dialog Generation via Adaptive Label Smoothing
Diversifying Dialog Generation via Adaptive Label Smoothing
Yida Wang
Yinhe Zheng
Yong-jia Jiang
Minlie Huang
28
37
0
30 May 2021
Deep Learning on Monocular Object Pose Detection and Tracking: A
  Comprehensive Overview
Deep Learning on Monocular Object Pose Detection and Tracking: A Comprehensive Overview
Zhaoxin Fan
Yazhi Zhu
Yulin He
Qi Sun
Hongyan Liu
Jun He
28
82
0
29 May 2021
GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation
GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation
Kang Min Yoo
Dongju Park
Jaewook Kang
Sang-Woo Lee
Woomyeong Park
36
235
0
18 Apr 2021
Domain Adaptation and Multi-Domain Adaptation for Neural Machine
  Translation: A Survey
Domain Adaptation and Multi-Domain Adaptation for Neural Machine Translation: A Survey
Danielle Saunders
AI4CE
25
85
0
14 Apr 2021
The Curious Case of Hallucinations in Neural Machine Translation
The Curious Case of Hallucinations in Neural Machine Translation
Vikas Raunak
Arul Menezes
Marcin Junczys-Dowmunt
44
190
0
14 Apr 2021
A Student-Teacher Architecture for Dialog Domain Adaptation under the
  Meta-Learning Setting
A Student-Teacher Architecture for Dialog Domain Adaptation under the Meta-Learning Setting
Kun Qian
Wei Wei
Zhou Yu
15
8
0
06 Apr 2021
Compressing Visual-linguistic Model via Knowledge Distillation
Compressing Visual-linguistic Model via Knowledge Distillation
Zhiyuan Fang
Jianfeng Wang
Xiaowei Hu
Lijuan Wang
Yezhou Yang
Zicheng Liu
VLM
39
97
0
05 Apr 2021
Pruning-then-Expanding Model for Domain Adaptation of Neural Machine
  Translation
Pruning-then-Expanding Model for Domain Adaptation of Neural Machine Translation
Shuhao Gu
Yang Feng
Wanying Xie
CLL
AI4CE
25
27
0
25 Mar 2021
Previous
12345
Next