Scaling Neural Machine Translation (arXiv 1806.00187)
1 June 2018
Myle Ott, Sergey Edunov, David Grangier, Michael Auli
Community: AIMat

Papers citing "Scaling Neural Machine Translation"
50 / 379 papers shown
VALUE: A Multi-Task Benchmark for Video-and-Language Understanding Evaluation
Linjie Li
Jie Lei
Zhe Gan
Licheng Yu
Yen-Chun Chen
...
Tamara L. Berg
Joey Tianyi Zhou
Jingjing Liu
Lijuan Wang
Zicheng Liu
VLM
32
100
0
08 Jun 2021
Self-supervised and Supervised Joint Training for Resource-rich Machine Translation
Yong Cheng
Wei Wang
Lu Jiang
Wolfgang Macherey
26
17
0
08 Jun 2021
On the Language Coverage Bias for Neural Machine Translation
Shuo Wang
Zhaopeng Tu
Zhixing Tan
Shuming Shi
Maosong Sun
Yang Liu
19
19
0
07 Jun 2021
Oriented Object Detection with Transformer
Teli Ma
Mingyuan Mao
Honghui Zheng
Peng Gao
Xiaodi Wang
Shumin Han
Errui Ding
Baochang Zhang
David Doermann
ViT
27
40
0
06 Jun 2021
Scalable Transformers for Neural Machine Translation
Peng Gao
Shijie Geng
Ping Luo
Xiaogang Wang
Jifeng Dai
Hongsheng Li
31
13
0
04 Jun 2021
Luna: Linear Unified Nested Attention
Xuezhe Ma
Xiang Kong
Sinong Wang
Chunting Zhou
Jonathan May
Hao Ma
Luke Zettlemoyer
33
114
0
03 Jun 2021
Cascaded Head-colliding Attention
Lin Zheng
Zhiyong Wu
Lingpeng Kong
27
2
0
31 May 2021
Investigating Code-Mixed Modern Standard Arabic-Egyptian to English Machine Translation
El Moatez Billah Nagoudi
AbdelRahim Elmadany
Muhammad Abdul-Mageed
MoE
25
11
0
28 May 2021
TranSmart: A Practical Interactive Machine Translation System
Guoping Huang
Lemao Liu
Xing Wang
Longyue Wang
Huayang Li
Zhaopeng Tu
Chengyang Huang
Shuming Shi
18
32
0
27 May 2021
Scaling Ensemble Distribution Distillation to Many Classes with Proxy Targets
Max Ryabinin
A. Malinin
Mark Gales
UQCV
20
18
0
14 May 2021
ResMLP: Feedforward networks for image classification with data-efficient training
Hugo Touvron
Piotr Bojanowski
Mathilde Caron
Matthieu Cord
Alaaeldin El-Nouby
...
Gautier Izacard
Armand Joulin
Gabriel Synnaeve
Jakob Verbeek
Hervé Jégou
VLM
36
656
0
07 May 2021
Entailment as Few-Shot Learner
Sinong Wang
Han Fang
Madian Khabsa
Hanzi Mao
Hao Ma
35
183
0
29 Apr 2021
Multimodal Contrastive Training for Visual Representation Learning
Xin Yuan
Zhe-nan Lin
Jason Kuen
Jianming Zhang
Yilin Wang
Michael Maire
Ajinkya Kale
Baldo Faieta
SSL
30
153
0
26 Apr 2021
ScaleCom: Scalable Sparsified Gradient Compression for Communication-Efficient Distributed Training
Chia-Yu Chen
Jiamin Ni
Songtao Lu
Xiaodong Cui
Pin-Yu Chen
...
Naigang Wang
Swagath Venkataramani
Vijayalakshmi Srinivasan
Wei Zhang
K. Gopalakrishnan
29
66
0
21 Apr 2021
How to Train BERT with an Academic Budget
Peter Izsak
Moshe Berchansky
Omer Levy
23
113
0
15 Apr 2021
Improving Gender Translation Accuracy with Filtered Self-Training
Prafulla Kumar Choubey
Anna Currey
Prashant Mathur
Georgiana Dinu
23
10
0
15 Apr 2021
Reward Optimization for Neural Machine Translation with Learned Metrics
Raphael Shu
Kang Min Yoo
Jung-Woo Ha
35
12
0
15 Apr 2021
Lessons on Parameter Sharing across Layers in Transformers
Sho Takase
Shun Kiyono
25
85
0
13 Apr 2021
Adversarial Regularization as Stackelberg Game: An Unrolled Optimization Approach
Simiao Zuo
Chen Liang
Haoming Jiang
Xiaodong Liu
Pengcheng He
Jianfeng Gao
Weizhu Chen
T. Zhao
55
9
0
11 Apr 2021
Better Neural Machine Translation by Extracting Linguistic Information from BERT
Hassan S. Shavarani
Anoop Sarkar
24
15
0
07 Apr 2021
Relaxing the Conditional Independence Assumption of CTC-based ASR by Conditioning on Intermediate Predictions
Jumon Nozaki
Tatsuya Komatsu
20
71
0
06 Apr 2021
ODE Transformer: An Ordinary Differential Equation-Inspired Model for Neural Machine Translation
Bei Li
Quan Du
Tao Zhou
Shuhan Zhou
Xin Zeng
Tong Xiao
Jingbo Zhu
23
22
0
06 Apr 2021
Rethinking Perturbations in Encoder-Decoders for Fast Training
Sho Takase
Shun Kiyono
33
45
0
05 Apr 2021
A Practical Survey on Faster and Lighter Transformers
Quentin Fournier
G. Caron
Daniel Aloise
14
93
0
26 Mar 2021
Mask Attention Networks: Rethinking and Strengthen Transformer
Zhihao Fan
Yeyun Gong
Dayiheng Liu
Zhongyu Wei
Siyuan Wang
Jian Jiao
Nan Duan
Ruofei Zhang
Xuanjing Huang
34
72
0
25 Mar 2021
Finetuning Pretrained Transformers into RNNs
Jungo Kasai
Hao Peng
Yizhe Zhang
Dani Yogatama
Gabriel Ilharco
Nikolaos Pappas
Yi Mao
Weizhu Chen
Noah A. Smith
44
63
0
24 Mar 2021
BERT: A Review of Applications in Natural Language Processing and Understanding
M. V. Koroteev
VLM
25
196
0
22 Mar 2021
Self-Learning for Zero Shot Neural Machine Translation
Surafel Melaku Lakew
Matteo Negri
Marco Turchi
21
1
0
10 Mar 2021
IOT: Instance-wise Layer Reordering for Transformer Structures
Jinhua Zhu
Lijun Wu
Yingce Xia
Shufang Xie
Tao Qin
Wen-gang Zhou
Houqiang Li
Tie-Yan Liu
31
7
0
05 Mar 2021
An empirical analysis of phrase-based and neural machine translation
Hamidreza Ghader
29
1
0
04 Mar 2021
Random Feature Attention
Hao Peng
Nikolaos Pappas
Dani Yogatama
Roy Schwartz
Noah A. Smith
Lingpeng Kong
36
348
0
03 Mar 2021
OmniNet: Omnidirectional Representations from Transformers
Yi Tay
Mostafa Dehghani
V. Aribandi
Jai Gupta
Philip Pham
Zhen Qin
Dara Bahri
Da-Cheng Juan
Donald Metzler
47
26
0
01 Mar 2021
SparseBERT: Rethinking the Importance Analysis in Self-attention
Han Shi
Jiahui Gao
Xiaozhe Ren
Hang Xu
Xiaodan Liang
Zhenguo Li
James T. Kwok
23
54
0
25 Feb 2021
Linear Transformers Are Secretly Fast Weight Programmers
Imanol Schlag
Kazuki Irie
Jürgen Schmidhuber
46
225
0
22 Feb 2021
Medical Transformer: Gated Axial-Attention for Medical Image Segmentation
Jeya Maria Jose Valanarasu
Poojan Oza
I. Hacihaliloglu
Vishal M. Patel
ViT
MedIm
28
963
0
21 Feb 2021
Searching for Search Errors in Neural Morphological Inflection
Martina Forster
Clara Meister
Ryan Cotterell
30
5
0
16 Feb 2021
Fast End-to-End Speech Recognition via Non-Autoregressive Models and Cross-Modal Knowledge Transferring from BERT
Ye Bai
Jiangyan Yi
J. Tao
Zhengkun Tian
Zhengqi Wen
Shuai Zhang
RALM
33
51
0
15 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
283
1,989
0
09 Feb 2021
Few-Shot Domain Adaptation for Grammatical Error Correction via Meta-Learning
Shengsheng Zhang
Yaping Huang
Yun-Nung Chen
Liner Yang
Chencheng Wang
Erhong Yang
VLM
32
2
0
29 Jan 2021
Enriching Non-Autoregressive Transformer with Syntactic and Semantic Structures for Neural Machine Translation
Ye Liu
Yao Wan
Jianguo Zhang
Wenting Zhao
Philip S. Yu
25
23
0
22 Jan 2021
Fast offline Transformer-based end-to-end automatic speech recognition for real-world applications
Y. Oh
Kiyoung Park
Jeongue Park
OffRL
22
5
0
14 Jan 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
Fahad Shahbaz Khan
M. Shah
ViT
227
2,431
0
04 Jan 2021
Reservoir Transformers
Sheng Shen
Alexei Baevski
Ari S. Morcos
Kurt Keutzer
Michael Auli
Douwe Kiela
35
17
0
30 Dec 2020
Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning
Xuebo Liu
Longyue Wang
Derek F. Wong
Liang Ding
Lidia S. Chao
Zhaopeng Tu
AI4CE
27
35
0
29 Dec 2020
Learning Light-Weight Translation Models from Deep Transformer
Bei Li
Ziyang Wang
Hui Liu
Quan Du
Tong Xiao
Chunliang Zhang
Jingbo Zhu
VLM
120
40
0
27 Dec 2020
Sub-Linear Memory: How to Make Performers SLiM
Valerii Likhosherstov
K. Choromanski
Jared Davis
Xingyou Song
Adrian Weller
23
19
0
21 Dec 2020
A Closer Look at the Robustness of Vision-and-Language Pre-trained Models
Linjie Li
Zhe Gan
Jingjing Liu
VLM
33
42
0
15 Dec 2020
Reciprocal Supervised Learning Improves Neural Machine Translation
Minkai Xu
Mingxuan Wang
Zhouhan Lin
Hao Zhou
Weinan Zhang
Lei Li
12
0
0
05 Dec 2020
GottBERT: a pure German Language Model
Raphael Scheible
Fabian Thomczyk
P. Tippmann
V. Jaravine
M. Boeker
VLM
19
76
0
03 Dec 2020
How Can We Know When Language Models Know? On the Calibration of Language Models for Question Answering
Zhengbao Jiang
Jun Araki
Haibo Ding
Graham Neubig
UQCV
31
410
0
02 Dec 2020