Understanding Knowledge Distillation in Non-autoregressive Machine Translation

7 November 2019
Chunting Zhou
Graham Neubig
Jiatao Gu
arXiv:1911.02727

Papers citing "Understanding Knowledge Distillation in Non-autoregressive Machine Translation"

50 / 54 papers shown
Title
FourierNAT: A Fourier-Mixing-Based Non-Autoregressive Transformer for Parallel Sequence Generation
Andrew Kiruluta
Eric Lundy
Andreas Lemos
AI4TS
47
0
0
04 Mar 2025
Decoupled Sequence and Structure Generation for Realistic Antibody Design
Nayoung Kim
Minsu Kim
Sungsoo Ahn
Jinkyoo Park
54
0
0
20 Jan 2025
Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison
Tsz Kin Lam
Marco Gaido
Sara Papi
L. Bentivogli
Barry Haddow
36
0
0
04 Jan 2025
Transformers to SSMs: Distilling Quadratic Knowledge to Subquadratic Models
Aviv Bick
Kevin Y. Li
Eric P. Xing
J. Zico Kolter
Albert Gu
Mamba
58
24
0
19 Aug 2024
CTC-based Non-autoregressive Textless Speech-to-Speech Translation
Qingkai Fang
Zhengrui Ma
Yan Zhou
Min Zhang
Yang Feng
52
0
0
11 Jun 2024
What Have We Achieved on Non-autoregressive Translation?
Yafu Li
Huajian Zhang
Jianhao Yan
Yongjing Yin
Yue Zhang
42
1
0
21 May 2024
Sentence-Level or Token-Level? A Comprehensive Study on Knowledge Distillation
Jingxuan Wei
Linzhuang Sun
Yichong Leng
Xu Tan
Bihui Yu
Ruifeng Guo
51
3
0
23 Apr 2024
Non-autoregressive Sequence-to-Sequence Vision-Language Models
Kunyu Shi
Qi Dong
Luis Goncalves
Zhuowen Tu
Stefano Soatto
VLM
47
3
0
04 Mar 2024
Analysis of Levenshtein Transformer's Decoder and Its Variants
Ruiyang Zhou
19
0
0
19 Feb 2024
Domain Adaptation of Multilingual Semantic Search -- Literature Review
Anna Bringmann
Anastasia Zhukova
VLM
43
0
0
05 Feb 2024
Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning
Jiasheng Ye
Zaixiang Zheng
Yu Bao
Lihua Qian
Quanquan Gu
DiffM
54
14
0
23 Aug 2023
Improving Image Captioning Descriptiveness by Ranking and LLM-based Fusion
Simone Bianco
Luigi Celona
Marco Donzella
Paolo Napoletano
36
18
0
20 Jun 2023
Online Distillation for Pseudo-Relevance Feedback
Sean MacAvaney
Xi Wang
30
2
0
16 Jun 2023
Revisiting Non-Autoregressive Translation at Scale
Zhihao Wang
Longyue Wang
Jinsong Su
Junfeng Yao
Zhaopeng Tu
36
3
0
25 May 2023
Multitask Prompt Tuning Enables Parameter-Efficient Transfer Learning
Zhen Wang
Yikang Shen
Leonid Karlinsky
Rogerio Feris
Huan Sun
Yoon Kim
VLM
VPVLM
44
108
0
06 Mar 2023
Harnessing Knowledge and Reasoning for Human-Like Natural Language Generation: A Brief Review
Jiangjie Chen
Yanghua Xiao
49
4
0
07 Dec 2022
Improving Simultaneous Machine Translation with Monolingual Data
Hexuan Deng
Liang Ding
Xuebo Liu
Meishan Zhang
Dacheng Tao
Min Zhang
35
12
0
02 Dec 2022
Multi-Granularity Optimization for Non-Autoregressive Translation
Yafu Li
Leyang Cui
Yongjing Yin
Yue Zhang
37
7
0
20 Oct 2022
A baseline revisited: Pushing the limits of multi-segment models for context-aware translation
Suvodeep Majumder
Stanislas Lauly
Maria Nadejde
Marcello Federico
Georgiana Dinu
38
13
0
19 Oct 2022
Model Criticism for Long-Form Text Generation
Yuntian Deng
Volodymyr Kuleshov
Alexander M. Rush
44
19
0
16 Oct 2022
CTC Alignments Improve Autoregressive Translation
Brian Yan
Siddharth Dalmia
Yosuke Higuchi
Graham Neubig
Florian Metze
A. Black
Shinji Watanabe
46
33
0
11 Oct 2022
Viterbi Decoding of Directed Acyclic Transformer for Non-Autoregressive Machine Translation
Chenze Shao
Zhengrui Ma
Yang Feng
42
14
0
11 Oct 2022
PROD: Progressive Distillation for Dense Retrieval
Zhenghao Lin
Yeyun Gong
Xiao Liu
Hang Zhang
Chen Lin
...
Jian Jiao
Jing Lu
Daxin Jiang
Rangan Majumder
Nan Duan
51
27
0
27 Sep 2022
Non-Autoregressive Machine Translation: It's Not as Fast as it Seems
Jindřich Helcl
Barry Haddow
Alexandra Birch
27
20
0
04 May 2022
latent-GLAT: Glancing at Latent Variables for Parallel Text Generation
Yu Bao
Hao Zhou
Shujian Huang
Dongqi Wang
Lihua Qian
Xinyu Dai
Jiajun Chen
Lei Li
31
38
0
05 Apr 2022
PreTR: Spatio-Temporal Non-Autoregressive Trajectory Prediction Transformer
Lina Achaji
Thierno Barry
Thibault Fouqueray
Julien Moreau
François Aioun
François Charpillet
18
15
0
17 Mar 2022
Can Multilinguality benefit Non-autoregressive Machine Translation?
Sweta Agrawal
Julia Kreutzer
Colin Cherry
AI4CE
29
1
0
16 Dec 2021
Towards More Efficient Insertion Transformer with Fractional Positional Encoding
Zhisong Zhang
Yizhe Zhang
W. Dolan
49
0
0
12 Dec 2021
Multilingual AMR Parsing with Noisy Knowledge Distillation
Deng Cai
Xin Li
Jackie Chun-Sing Ho
Lidong Bing
W. Lam
27
18
0
30 Sep 2021
Towards Reinforcement Learning for Pivot-based Neural Machine Translation with Non-autoregressive Transformer
Evgeniia Tokarchuk
Jan Rosendahl
Weiyue Wang
Pavel Petrushkov
Tomer Lancewicki
Shahram Khadivi
Hermann Ney
LRM
8
1
0
27 Sep 2021
Integrated Training for Sequence-to-Sequence Models Using Non-Autoregressive Transformer
Evgeniia Tokarchuk
Jan Rosendahl
Weiyue Wang
Pavel Petrushkov
Tomer Lancewicki
Shahram Khadivi
Hermann Ney
28
2
0
27 Sep 2021
Partial to Whole Knowledge Distillation: Progressive Distilling Decomposed Knowledge Boosts Student Better
Xuanyang Zhang
Xinming Zhang
Jian Sun
25
1
0
26 Sep 2021
Scaling Laws for Neural Machine Translation
Behrooz Ghorbani
Orhan Firat
Markus Freitag
Ankur Bapna
M. Krikun
Xavier Garcia
Ciprian Chelba
Colin Cherry
40
99
0
16 Sep 2021
AligNART: Non-autoregressive Neural Machine Translation by Jointly Learning to Estimate Alignment and Translate
Jongyoon Song
Sungwon Kim
Sungroh Yoon
74
37
0
14 Sep 2021
Non-autoregressive End-to-end Speech Translation with Parallel Autoregressive Rescoring
Hirofumi Inaguma
Yosuke Higuchi
Kevin Duh
Tatsuya Kawahara
Shinji Watanabe
63
11
0
09 Sep 2021
Learning Energy-Based Approximate Inference Networks for Structured Applications in NLP
Lifu Tu
BDL
35
0
0
27 Aug 2021
MvSR-NAT: Multi-view Subset Regularization for Non-Autoregressive Machine Translation
Pan Xie
Zexian Li
Xiaohui Hu
34
11
0
19 Aug 2021
The USYD-JD Speech Translation System for IWSLT 2021
Liang Ding
Di Wu
Dacheng Tao
29
16
0
24 Jul 2021
Improving Sequence-to-Sequence Pre-training via Sequence Span Rewriting
Wangchunshu Zhou
Tao Ge
Canwen Xu
Ke Xu
Furu Wei
LRM
16
15
0
02 Jan 2021
Fully Non-autoregressive Neural Machine Translation: Tricks of the Trade
Jiatao Gu
X. Kong
31
135
0
31 Dec 2020
Neural Machine Translation: A Review of Methods, Resources, and Tools
Zhixing Tan
Shuo Wang
Zonghan Yang
Gang Chen
Xuancheng Huang
Maosong Sun
Yang Liu
3DV
AI4TS
25
105
0
31 Dec 2020
Understanding and Improving Lexical Choice in Non-Autoregressive Translation
Liang Ding
Longyue Wang
Xuebo Liu
Derek F. Wong
Dacheng Tao
Zhaopeng Tu
112
77
0
29 Dec 2020
Infusing Sequential Information into Conditional Masked Translation Model with Self-Review Mechanism
Pan Xie
Zhi Cui
Preslav Nakov
Xiaohui Hu
Jianwei Cui
Bin Wang
154
9
0
19 Oct 2020
Lifelong Language Knowledge Distillation
Yung-Sung Chuang
Shang-Yu Su
Yun-Nung Chen
KELM
CLL
27
49
0
05 Oct 2020
Deep Encoder, Shallow Decoder: Reevaluating Non-autoregressive Machine Translation
Jungo Kasai
Nikolaos Pappas
Hao Peng
James Cross
Noah A. Smith
41
134
0
18 Jun 2020
Knowledge Distillation: A Survey
Jianping Gou
B. Yu
Stephen J. Maybank
Dacheng Tao
VLM
21
2,851
0
09 Jun 2020
Self-Distillation as Instance-Specific Label Smoothing
Zhilu Zhang
M. Sabuncu
20
116
0
09 Jun 2020
An Overview of Neural Network Compression
James O'Neill
AI4CE
45
98
0
05 Jun 2020
Imitation Attacks and Defenses for Black-box Machine Translation Systems
Eric Wallace
Mitchell Stern
D. Song
AAML
22
119
0
30 Apr 2020
Non-Autoregressive Machine Translation with Latent Alignments
Chitwan Saharia
William Chan
Saurabh Saxena
Mohammad Norouzi
19
157
0
16 Apr 2020