Sequence-Level Knowledge Distillation


25 June 2016
Yoon Kim
Alexander M. Rush

Papers citing "Sequence-Level Knowledge Distillation"

44 / 244 papers shown
Title
On the Efficacy of Knowledge Distillation
Jang Hyun Cho
Bharath Hariharan
21
598
0
03 Oct 2019
Automatically Learning Data Augmentation Policies for Dialogue Tasks
Tong Niu
Mohit Bansal
21
39
0
27 Sep 2019
Hint-Based Training for Non-Autoregressive Machine Translation
Zhuohan Li
Zi Lin
Di He
Fei Tian
Tao Qin
Liwei Wang
Tie-Yan Liu
31
72
0
15 Sep 2019
Recurrent Neural Networks: An Embedded Computing Perspective
Nesma M. Rezk
M. Purnaprajna
Tomas Nordstrom
Z. Ul-Abdin
40
81
0
23 Jul 2019
Evaluating Explanation Without Ground Truth in Interpretable Machine Learning
Fan Yang
Mengnan Du
Xia Hu
XAI
ELM
27
66
0
16 Jul 2019
Learn Spelling from Teachers: Transferring Knowledge from Language Models to Sequence-to-Sequence Speech Recognition
Ye Bai
Jiangyan Yi
J. Tao
Zhengkun Tian
Zhengqi Wen
KELM
22
38
0
13 Jul 2019
BAM! Born-Again Multi-Task Networks for Natural Language Understanding
Kevin Clark
Minh-Thang Luong
Urvashi Khandelwal
Christopher D. Manning
Quoc V. Le
21
228
0
10 Jul 2019
Sharing Attention Weights for Fast Transformer
Tong Xiao
Yinqiao Li
Jingbo Zhu
Zhengtao Yu
Tongran Liu
17
50
0
26 Jun 2019
Sequence Generation: From Both Sides to the Middle
Long Zhou
Jiajun Zhang
Chengqing Zong
Heng Yu
28
22
0
23 Jun 2019
Retrieving Sequential Information for Non-Autoregressive Neural Machine Translation
Chenze Shao
Yang Feng
Jinchao Zhang
Fandong Meng
Xilin Chen
Jie Zhou
24
42
0
22 Jun 2019
Tagged Back-Translation
Isaac Caswell
Ciprian Chelba
David Grangier
24
218
0
15 Jun 2019
Scalable Syntax-Aware Language Models Using Knowledge Distillation
A. Kuncoro
Chris Dyer
Laura Rimell
S. Clark
Phil Blunsom
35
26
0
14 Jun 2019
Unified Semantic Parsing with Weak Supervision
Priyanka Agrawal
Parag Jain
Ayushi Dalmia
Abhishek Bansal
Ashish R. Mittal
Karthik Sankaranarayanan
36
10
0
12 Jun 2019
KERMIT: Generative Insertion-Based Modeling for Sequences
William Chan
Nikita Kitaev
Kelvin Guu
Mitchell Stern
Jakob Uszkoreit
VLM
23
65
0
04 Jun 2019
Levenshtein Transformer
Jiatao Gu
Changhan Wang
Jake Zhao
49
359
0
27 May 2019
Conditional Teacher-Student Learning
Zhong Meng
Jinyu Li
Yong Zhao
Yifan Gong
22
90
0
28 Apr 2019
TextKD-GAN: Text Generation using Knowledge Distillation and Generative Adversarial Networks
Md. Akmal Haidar
Mehdi Rezagholizadeh
34
52
0
23 Apr 2019
End-to-End Speech Translation with Knowledge Distillation
Yuchen Liu
Hao Xiong
Zhongjun He
Jiajun Zhang
Hua Wu
Haifeng Wang
Chengqing Zong
32
151
0
17 Apr 2019
Probability density distillation with generative adversarial networks for high-quality parallel waveform generation
Ryuichi Yamamoto
Eunwoo Song
Jae-Min Kim
19
55
0
09 Apr 2019
Benchmarking Approximate Inference Methods for Neural Structured Prediction
Lifu Tu
Kevin Gimpel
BDL
33
17
0
01 Apr 2019
Distilling Task-Specific Knowledge from BERT into Simple Neural Networks
Raphael Tang
Yao Lu
Linqing Liu
Lili Mou
Olga Vechtomova
Jimmy J. Lin
32
417
0
28 Mar 2019
Multilingual Neural Machine Translation with Knowledge Distillation
Xu Tan
Yi Ren
Di He
Tao Qin
Zhou Zhao
Tie-Yan Liu
20
248
0
27 Feb 2019
Non-Autoregressive Machine Translation with Auxiliary Regularization
Yiren Wang
Fei Tian
Di He
Tao Qin
ChengXiang Zhai
Tie-Yan Liu
16
158
0
22 Feb 2019
Non-Autoregressive Neural Machine Translation with Enhanced Decoder Input
Junliang Guo
Xu Tan
Di He
Tao Qin
Linli Xu
Tie-Yan Liu
16
125
0
23 Dec 2018
Sequence-Level Knowledge Distillation for Model Compression of Attention-based Sequence-to-Sequence Speech Recognition
Raden Mu'az Mun'im
Nakamasa Inoue
Koichi Shinoda
22
25
0
12 Nov 2018
Language-Independent Representor for Neural Machine Translation
Long Zhou
Yuchen Liu
Jiajun Zhang
Chengqing Zong
Guoping Huang
17
1
0
01 Nov 2018
Multi-Source Neural Machine Translation with Data Augmentation
Yuta Nishimura
Katsuhito Sudoh
Graham Neubig
Satoshi Nakamura
17
20
0
16 Oct 2018
Semi-Supervised Sequence Modeling with Cross-View Training
Kevin Clark
Minh-Thang Luong
Christopher D. Manning
Quoc V. Le
SSL
11
333
0
22 Sep 2018
Ranking Distillation: Learning Compact Ranking Models With High Performance for Recommender System
Jiaxi Tang
Ke Wang
27
182
0
19 Sep 2018
Attention-Guided Answer Distillation for Machine Reading Comprehension
Minghao Hu
Yuxing Peng
Furu Wei
Zhen Huang
Dongsheng Li
Nan Yang
M. Zhou
FaML
23
75
0
23 Aug 2018
Findings of the Second Workshop on Neural Machine Translation and Generation
Alexandra Birch
A. Finch
Minh-Thang Luong
Graham Neubig
Yusuke Oda
DRL
31
12
0
08 Jun 2018
Distilling Knowledge for Search-based Structured Prediction
Yijia Liu
Wanxiang Che
Huaipeng Zhao
Bing Qin
Ting Liu
27
22
0
29 May 2018
Theory and Experiments on Vector Quantized Autoencoders
Aurko Roy
Ashish Vaswani
Arvind Neelakantan
Niki Parmar
11
85
0
28 May 2018
Triangular Architecture for Rare Language Translation
Shuo Ren
Wenhu Chen
Shujie Liu
Mu Li
M. Zhou
Shuai Ma
26
33
0
13 May 2018
Born Again Neural Networks
Tommaso Furlanello
Zachary Chase Lipton
Michael Tschannen
Laurent Itti
Anima Anandkumar
36
1,020
0
12 May 2018
Parsing Tweets into Universal Dependencies
Yijia Liu
Yi Zhu
Wanxiang Che
Bing Qin
Nathan Schneider
Noah A. Smith
16
74
0
23 Apr 2018
Deterministic Non-Autoregressive Neural Sequence Modeling by Iterative Refinement
Jason D. Lee
Elman Mansimov
Kyunghyun Cho
DiffM
BDL
30
455
0
19 Feb 2018
A Teacher-Student Framework for Zero-Resource Neural Machine Translation
Yun Chen
Yang Liu
Yong Cheng
V. Li
35
147
0
02 May 2017
Boosting Neural Machine Translation
Dakun Zhang
Jungi Kim
Josep Crego
Jean Senellart
AI4CE
23
26
0
19 Dec 2016
Scalable Bayesian Learning of Recurrent Neural Networks for Language Modeling
Zhe Gan
Chunyuan Li
Changyou Chen
Yunchen Pu
Qinliang Su
Lawrence Carin
BDL
UQCV
53
41
0
23 Nov 2016
SYSTRAN's Pure Neural Machine Translation Systems
Josep Crego
Jungi Kim
Guillaume Klein
Anabel Rebollo
Kathy Yang
...
Bo Wang
Jin Yang
Dakun Zhang
Jing Zhou
Peter Zoldan
36
125
0
18 Oct 2016
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,746
0
26 Sep 2016
Distilling an Ensemble of Greedy Dependency Parsers into One MST Parser
A. Kuncoro
Miguel Ballesteros
Lingpeng Kong
Chris Dyer
Noah A. Smith
MoE
23
77
0
24 Sep 2016
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong
Hieu H. Pham
Christopher D. Manning
218
7,926
0
17 Aug 2015