1806.00187
Cited By
Scaling Neural Machine Translation
1 June 2018
Myle Ott
Sergey Edunov
David Grangier
Michael Auli
AIMat
Papers citing
"Scaling Neural Machine Translation"
50 / 379 papers shown
Title
Norm-Based Curriculum Learning for Neural Machine Translation
Xuebo Liu
Houtim Lai
Derek F. Wong
Lidia S. Chao
25
118
0
03 Jun 2020
Is 42 the Answer to Everything in Subtitling-oriented Speech Translation?
Alina Karakanta
Matteo Negri
Marco Turchi
14
33
0
01 Jun 2020
BadNL: Backdoor Attacks against NLP Models with Semantic-preserving Improvements
Xiaoyi Chen
A. Salem
Dingfan Chen
Michael Backes
Shiqing Ma
Qingni Shen
Zhonghai Wu
Yang Zhang
SILM
29
228
0
01 Jun 2020
ADAHESSIAN: An Adaptive Second Order Optimizer for Machine Learning
Z. Yao
A. Gholami
Sheng Shen
Mustafa Mustafa
Kurt Keutzer
Michael W. Mahoney
ODL
39
275
0
01 Jun 2020
Unsupervised Quality Estimation for Neural Machine Translation
M. Fomicheva
Shuo Sun
Lisa Yankovskaya
Frédéric Blain
Francisco Guzmán
Mark Fishel
Nikolaos Aletras
Vishrav Chaudhary
Lucia Specia
UQLM
20
184
0
21 May 2020
BiQGEMM: Matrix Multiplication with Lookup Table For Binary-Coding-based Quantized DNNs
Yongkweon Jeon
Baeseong Park
S. Kwon
Byeongwook Kim
Jeongin Yun
Dongsoo Lee
MQ
33
30
0
20 May 2020
Dual Learning: Theoretical Study and an Algorithmic Extension
Zhibing Zhao
Yingce Xia
Tao Qin
Lirong Xia
Tie-Yan Liu
29
11
0
17 May 2020
Rethinking and Improving Natural Language Generation with Layer-Wise Multi-View Decoding
Fenglin Liu
Xuancheng Ren
Guangxiang Zhao
Chenyu You
Xuewei Ma
Xian Wu
Xu Sun
40
2
0
16 May 2020
A Mixture of $h-1$ Heads is Better than $h$ Heads
Hao Peng
Roy Schwartz
Dianqi Li
Noah A. Smith
MoE
27
32
0
13 May 2020
Listen Attentively, and Spell Once: Whole Sentence Generation via a Non-Autoregressive Architecture for Low-Latency Speech Recognition
Ye Bai
Jiangyan Yi
J. Tao
Zhengkun Tian
Zhengqi Wen
Shuai Zhang
RALM
23
41
0
11 May 2020
schuBERT: Optimizing Elements of BERT
A. Khetan
Zohar Karnin
28
30
0
09 May 2020
It's Morphin' Time! Combating Linguistic Discrimination with Inflectional Perturbations
Samson Tan
Chenyu You
Min-Yen Kan
R. Socher
166
103
0
09 May 2020
Dynamically Adjusting Transformer Batch Size by Monitoring Gradient Direction Change
Hongfei Xu
Josef van Genabith
Deyi Xiong
Qiuhui Liu
14
10
0
05 May 2020
Learning an Unreferenced Metric for Online Dialogue Evaluation
Koustuv Sinha
Prasanna Parthasarathi
Jasmine Wang
Ryan J. Lowe
William L. Hamilton
Joelle Pineau
OffRL
29
84
0
01 May 2020
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
Linjie Li
Yen-Chun Chen
Yu Cheng
Zhe Gan
Licheng Yu
Jingjing Liu
MLLM
VLM
OffRL
AI4TS
46
493
0
01 May 2020
Mind Your Inflections! Improving NLP for Non-Standard Englishes with Base-Inflection Encoding
Samson Tan
Chenyu You
L. Varshney
Min-Yen Kan
17
34
0
30 Apr 2020
Multiscale Collaborative Deep Models for Neural Machine Translation
Xiangpeng Wei
Heng Yu
Yue Hu
Yue Zhang
Rongxiang Weng
Weihua Luo
27
28
0
29 Apr 2020
All Word Embeddings from One Embedding
Sho Takase
Sosuke Kobayashi
11
10
0
25 Apr 2020
Lite Transformer with Long-Short Range Attention
Zhanghao Wu
Zhijian Liu
Ji Lin
Yujun Lin
Song Han
23
318
0
24 Apr 2020
Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck
Yang Zhao
Ping Yu
Suchismit Mahapatra
Qinliang Su
Changyou Chen
DRL
17
2
0
22 Apr 2020
Highway Transformer: Self-Gating Enhanced Self-Attentive Networks
Yekun Chai
Jin Shuo
Xinwen Hou
23
16
0
17 Apr 2020
Towards Automatic Generation of Questions from Long Answers
Shlok Kumar Mishra
Pranav Goel
Abhishek Sharma
Abhyuday N. Jagannatha
David Jacobs
Hal Daumé
28
8
0
10 Apr 2020
Translation Artifacts in Cross-lingual Transfer Learning
Mikel Artetxe
Gorka Labaka
Eneko Agirre
27
115
0
09 Apr 2020
Detecting and Understanding Generalization Barriers for Neural Machine Translation
Guanlin Li
Lemao Liu
Conghui Zhu
T. Zhao
Shuming Shi
28
0
0
05 Apr 2020
SAC: Accelerating and Structuring Self-Attention via Sparse Adaptive Connection
Xiaoya Li
Yuxian Meng
Mingxin Zhou
Qinghong Han
Fei Wu
Jiwei Li
27
20
0
22 Mar 2020
PowerNorm: Rethinking Batch Normalization in Transformers
Sheng Shen
Z. Yao
A. Gholami
Michael W. Mahoney
Kurt Keutzer
BDL
24
16
0
17 Mar 2020
Learning to Encode Position for Transformer with Continuous Dynamical Model
Xuanqing Liu
Hsiang-Fu Yu
Inderjit Dhillon
Cho-Jui Hsieh
16
107
0
13 Mar 2020
Meta-Embeddings Based On Self-Attention
Qichen Li
Xiaoke Jiang
Jun Xia
Jian Li
21
2
0
03 Mar 2020
Transformer++
Prakhar Thapak
P. Hore
6
0
0
02 Mar 2020
Do all Roads Lead to Rome? Understanding the Role of Initialization in Iterative Back-Translation
Mikel Artetxe
Gorka Labaka
Noe Casas
Eneko Agirre
LRM
29
5
0
28 Feb 2020
Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers
Zhuohan Li
Eric Wallace
Sheng Shen
Kevin Lin
Kurt Keutzer
Dan Klein
Joseph E. Gonzalez
22
148
0
26 Feb 2020
Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation
Alessandro Raganato
Yves Scherrer
Jörg Tiedemann
32
92
0
24 Feb 2020
Overlap Local-SGD: An Algorithmic Approach to Hide Communication Delays in Distributed SGD
Jianyu Wang
Hao Liang
Gauri Joshi
22
33
0
21 Feb 2020
Balancing Cost and Benefit with Tied-Multi Transformers
Raj Dabre
Raphaël Rubino
Atsushi Fujita
14
6
0
20 Feb 2020
Tree-structured Attention with Hierarchical Accumulation
Xuan-Phi Nguyen
Chenyu You
Guosheng Lin
R. Socher
4
76
0
19 Feb 2020
Uncertainty Estimation in Autoregressive Structured Prediction
A. Malinin
Mark Gales
UQLM
22
9
0
18 Feb 2020
Low-Rank Bottleneck in Multi-head Attention Models
Srinadh Bhojanapalli
Chulhee Yun
A. S. Rawat
Sashank J. Reddi
Sanjiv Kumar
24
94
0
17 Feb 2020
Incorporating BERT into Neural Machine Translation
Jinhua Zhu
Yingce Xia
Lijun Wu
Di He
Tao Qin
Wen-gang Zhou
Houqiang Li
Tie-Yan Liu
FedML
AIMat
10
354
0
17 Feb 2020
FQuAD: French Question Answering Dataset
Martin d'Hoffschmidt
Wacim Belblidia
Tom Brendlé
Quentin Heinrich
Maxime Vidal
23
98
0
14 Feb 2020
Time-aware Large Kernel Convolutions
Vasileios Lioutas
Yuhong Guo
AI4TS
16
29
0
08 Feb 2020
Towards the Systematic Reporting of the Energy and Carbon Footprints of Machine Learning
Peter Henderson
Jie Hu
Joshua Romoff
Emma Brunskill
Dan Jurafsky
Joelle Pineau
25
437
0
31 Jan 2020
Normalization of Input-output Shared Embeddings in Text Generation Models
Jinyang Liu
Yujia Zhai
Zizhong Chen
25
0
0
22 Jan 2020
Non-Autoregressive Machine Translation with Disentangled Context Transformer
Jungo Kasai
James Cross
Marjan Ghazvininejad
Jiatao Gu
25
33
0
15 Jan 2020
Reformer: The Efficient Transformer
Nikita Kitaev
Lukasz Kaiser
Anselm Levskaya
VLM
40
2,258
0
13 Jan 2020
Explicit Sparse Transformer: Concentrated Attention Through Explicit Selection
Guangxiang Zhao
Junyang Lin
Zhiyuan Zhang
Xuancheng Ren
Qi Su
Xu Sun
22
108
0
25 Dec 2019
Tag-less Back-Translation
Idris Abdulmumin
B. Galadanci
Aliyu Dadan Garba
19
11
0
22 Dec 2019
Neural Machine Translation: A Review and Survey
Felix Stahlberg
3DV
AI4TS
MedIm
20
312
0
04 Dec 2019
Better Understanding Hierarchical Visual Relationship for Image Caption
Z. Fei
24
0
0
04 Dec 2019
Merging External Bilingual Pairs into Neural Machine Translation
Tao Wang
Shaohui Kuang
Deyi Xiong
António Branco
19
10
0
02 Dec 2019
Iterative Batch Back-Translation for Neural Machine Translation: A Conceptual Model
Idris Abdulmumin
B. Galadanci
Abubakar Isa
10
0
0
26 Nov 2019