Scaling Neural Machine Translation

1 June 2018
Myle Ott
Sergey Edunov
David Grangier
Michael Auli
    AIMat

Papers citing "Scaling Neural Machine Translation"

50 / 379 papers shown
Norm-Based Curriculum Learning for Neural Machine Translation
Xuebo Liu
Houtim Lai
Derek F. Wong
Lidia S. Chao
25
118
0
03 Jun 2020
Is 42 the Answer to Everything in Subtitling-oriented Speech Translation?
Alina Karakanta
Matteo Negri
Marco Turchi
14
33
0
01 Jun 2020
BadNL: Backdoor Attacks against NLP Models with Semantic-preserving Improvements
Xiaoyi Chen
A. Salem
Dingfan Chen
Michael Backes
Shiqing Ma
Qingni Shen
Zhonghai Wu
Yang Zhang
SILM
29
228
0
01 Jun 2020
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
Z. Yao
A. Gholami
Sheng Shen
Mustafa Mustafa
Kurt Keutzer
Michael W. Mahoney
ODL
39
275
0
01 Jun 2020
Unsupervised Quality Estimation for Neural Machine Translation
M. Fomicheva
Shuo Sun
Lisa Yankovskaya
Frédéric Blain
Francisco Guzmán
Mark Fishel
Nikolaos Aletras
Vishrav Chaudhary
Lucia Specia
UQLM
20
184
0
21 May 2020
BiQGEMM: Matrix Multiplication with Lookup Table For Binary-Coding-based Quantized DNNs
Yongkweon Jeon
Baeseong Park
S. Kwon
Byeongwook Kim
Jeongin Yun
Dongsoo Lee
MQ
33
30
0
20 May 2020
Dual Learning: Theoretical Study and an Algorithmic Extension
Zhibing Zhao
Yingce Xia
Tao Qin
Lirong Xia
Tie-Yan Liu
29
11
0
17 May 2020
Rethinking and Improving Natural Language Generation with Layer-Wise Multi-View Decoding
Fenglin Liu
Xuancheng Ren
Guangxiang Zhao
Chenyu You
Xuewei Ma
Xian Wu
Xu Sun
40
2
0
16 May 2020
A Mixture of $h-1$ Heads is Better than $h$ Heads
Hao Peng
Roy Schwartz
Dianqi Li
Noah A. Smith
MoE
27
32
0
13 May 2020
Listen Attentively, and Spell Once: Whole Sentence Generation via a Non-Autoregressive Architecture for Low-Latency Speech Recognition
Ye Bai
Jiangyan Yi
J. Tao
Zhengkun Tian
Zhengqi Wen
Shuai Zhang
RALM
23
41
0
11 May 2020
schuBERT: Optimizing Elements of BERT
A. Khetan
Zohar Karnin
28
30
0
09 May 2020
It's Morphin' Time! Combating Linguistic Discrimination with Inflectional Perturbations
Samson Tan
Chenyu You
Min-Yen Kan
R. Socher
166
103
0
09 May 2020
Dynamically Adjusting Transformer Batch Size by Monitoring Gradient Direction Change
Hongfei Xu
Josef van Genabith
Deyi Xiong
Qiuhui Liu
14
10
0
05 May 2020
Learning an Unreferenced Metric for Online Dialogue Evaluation
Koustuv Sinha
Prasanna Parthasarathi
Jasmine Wang
Ryan J. Lowe
William L. Hamilton
Joelle Pineau
OffRL
29
84
0
01 May 2020
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
Linjie Li
Yen-Chun Chen
Yu Cheng
Zhe Gan
Licheng Yu
Jingjing Liu
MLLM
VLM
OffRL
AI4TS
46
493
0
01 May 2020
Mind Your Inflections! Improving NLP for Non-Standard Englishes with Base-Inflection Encoding
Samson Tan
Chenyu You
L. Varshney
Min-Yen Kan
17
34
0
30 Apr 2020
Multiscale Collaborative Deep Models for Neural Machine Translation
Xiangpeng Wei
Heng Yu
Yue Hu
Yue Zhang
Rongxiang Weng
Weihua Luo
27
28
0
29 Apr 2020
All Word Embeddings from One Embedding
Sho Takase
Sosuke Kobayashi
11
10
0
25 Apr 2020
Lite Transformer with Long-Short Range Attention
Zhanghao Wu
Zhijian Liu
Ji Lin
Yujun Lin
Song Han
23
318
0
24 Apr 2020
Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck
Yang Zhao
Ping Yu
Suchismit Mahapatra
Qinliang Su
Changyou Chen
DRL
17
2
0
22 Apr 2020
Highway Transformer: Self-Gating Enhanced Self-Attentive Networks
Yekun Chai
Jin Shuo
Xinwen Hou
23
16
0
17 Apr 2020
Towards Automatic Generation of Questions from Long Answers
Shlok Kumar Mishra
Pranav Goel
Abhishek Sharma
Abhyuday N. Jagannatha
David Jacobs
Hal Daumé
28
8
0
10 Apr 2020
Translation Artifacts in Cross-lingual Transfer Learning
Mikel Artetxe
Gorka Labaka
Eneko Agirre
27
115
0
09 Apr 2020
Detecting and Understanding Generalization Barriers for Neural Machine Translation
Guanlin Li
Lemao Liu
Conghui Zhu
T. Zhao
Shuming Shi
28
0
0
05 Apr 2020
SAC: Accelerating and Structuring Self-Attention via Sparse Adaptive Connection
Xiaoya Li
Yuxian Meng
Mingxin Zhou
Qinghong Han
Fei Wu
Jiwei Li
27
20
0
22 Mar 2020
PowerNorm: Rethinking Batch Normalization in Transformers
Sheng Shen
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
BDL
24
16
0
17 Mar 2020
Learning to Encode Position for Transformer with Continuous Dynamical Model
Xuanqing Liu
Hsiang-Fu Yu
Inderjit Dhillon
Cho-Jui Hsieh
16
107
0
13 Mar 2020
Meta-Embeddings Based On Self-Attention
Qichen Li
Xiaoke Jiang
Jun Xia
Jian Li
21
2
0
03 Mar 2020
Transformer++
Prakhar Thapak
P. Hore
6
0
0
02 Mar 2020
Do all Roads Lead to Rome? Understanding the Role of Initialization in Iterative Back-Translation
Mikel Artetxe
Gorka Labaka
Noe Casas
Eneko Agirre
LRM
29
5
0
28 Feb 2020
Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers
Zhuohan Li
Eric Wallace
Sheng Shen
Kevin Lin
Kurt Keutzer
Dan Klein
Joseph E. Gonzalez
22
148
0
26 Feb 2020
Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation
Alessandro Raganato
Yves Scherrer
Jörg Tiedemann
32
92
0
24 Feb 2020
Overlap Local-SGD: An Algorithmic Approach to Hide Communication Delays in Distributed SGD
Jianyu Wang
Hao Liang
Gauri Joshi
22
33
0
21 Feb 2020
Balancing Cost and Benefit with Tied-Multi Transformers
Raj Dabre
Raphaël Rubino
Atsushi Fujita
14
6
0
20 Feb 2020
Tree-structured Attention with Hierarchical Accumulation
Xuan-Phi Nguyen
Chenyu You
Guosheng Lin
R. Socher
4
76
0
19 Feb 2020
Uncertainty Estimation in Autoregressive Structured Prediction
A. Malinin
Mark Gales
UQLM
22
9
0
18 Feb 2020
Low-Rank Bottleneck in Multi-head Attention Models
Srinadh Bhojanapalli
Chulhee Yun
A. S. Rawat
Sashank J. Reddi
Sanjiv Kumar
24
94
0
17 Feb 2020
Incorporating BERT into Neural Machine Translation
Jinhua Zhu
Yingce Xia
Lijun Wu
Di He
Tao Qin
Wen-gang Zhou
Houqiang Li
Tie-Yan Liu
FedML
AIMat
10
354
0
17 Feb 2020
FQuAD: French Question Answering Dataset
Martin d'Hoffschmidt
Wacim Belblidia
Tom Brendlé
Quentin Heinrich
Maxime Vidal
23
98
0
14 Feb 2020
Time-aware Large Kernel Convolutions
Vasileios Lioutas
Yuhong Guo
AI4TS
16
29
0
08 Feb 2020
Towards the Systematic Reporting of the Energy and Carbon Footprints of Machine Learning
Peter Henderson
Jie Hu
Joshua Romoff
Emma Brunskill
Dan Jurafsky
Joelle Pineau
25
437
0
31 Jan 2020
Normalization of Input-output Shared Embeddings in Text Generation Models
Jinyang Liu
Yujia Zhai
Zizhong Chen
25
0
0
22 Jan 2020
Non-Autoregressive Machine Translation with Disentangled Context Transformer
Jungo Kasai
James Cross
Marjan Ghazvininejad
Jiatao Gu
25
33
0
15 Jan 2020
Reformer: The Efficient Transformer
Nikita Kitaev
Lukasz Kaiser
Anselm Levskaya
VLM
40
2,258
0
13 Jan 2020
Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection
Guangxiang Zhao
Junyang Lin
Zhiyuan Zhang
Xuancheng Ren
Qi Su
Xu Sun
22
108
0
25 Dec 2019
Tag-less Back-Translation
Idris Abdulmumin
B. Galadanci
Aliyu Dadan Garba
19
11
0
22 Dec 2019
Neural Machine Translation: A Review and Survey
Felix Stahlberg
3DV
AI4TS
MedIm
20
312
0
04 Dec 2019
Better Understanding Hierarchical Visual Relationship for Image Caption
Z. Fei
24
0
0
04 Dec 2019
Merging External Bilingual Pairs into Neural Machine Translation
Tao Wang
Shaohui Kuang
Deyi Xiong
António Branco
19
10
0
02 Dec 2019
Iterative Batch Back-Translation for Neural Machine Translation: A Conceptual Model
Idris Abdulmumin
B. Galadanci
Abubakar Isa
10
0
0
26 Nov 2019