ResearchTrend.AI
© 2025 ResearchTrend.AI, All rights reserved.

Scaling Neural Machine Translation

1 June 2018
Myle Ott
Sergey Edunov
David Grangier
Michael Auli
    AIMat

Papers citing "Scaling Neural Machine Translation"

50 / 379 papers shown
VALUE: A Multi-Task Benchmark for Video-and-Language Understanding Evaluation
Linjie Li
Jie Lei
Zhe Gan
Licheng Yu
Yen-Chun Chen
...
Tamara L. Berg
Joey Tianyi Zhou
Jingjing Liu
Lijuan Wang
Zicheng Liu
VLM
32
100
0
08 Jun 2021
Self-supervised and Supervised Joint Training for Resource-rich Machine Translation
Yong Cheng
Wei Wang
Lu Jiang
Wolfgang Macherey
26
17
0
08 Jun 2021
On the Language Coverage Bias for Neural Machine Translation
Shuo Wang
Zhaopeng Tu
Zhixing Tan
Shuming Shi
Maosong Sun
Yang Liu
19
19
0
07 Jun 2021
Oriented Object Detection with Transformer
Teli Ma
Mingyuan Mao
Honghui Zheng
Peng Gao
Xiaodi Wang
Shumin Han
Errui Ding
Baochang Zhang
David Doermann
ViT
27
40
0
06 Jun 2021
Scalable Transformers for Neural Machine Translation
Peng Gao
Shijie Geng
Ping Luo
Xiaogang Wang
Jifeng Dai
Hongsheng Li
31
13
0
04 Jun 2021
Luna: Linear Unified Nested Attention
Xuezhe Ma
Xiang Kong
Sinong Wang
Chunting Zhou
Jonathan May
Hao Ma
Luke Zettlemoyer
33
114
0
03 Jun 2021
Cascaded Head-colliding Attention
Lin Zheng
Zhiyong Wu
Lingpeng Kong
27
2
0
31 May 2021
Investigating Code-Mixed Modern Standard Arabic-Egyptian to English Machine Translation
El Moatez Billah Nagoudi
AbdelRahim Elmadany
Muhammad Abdul-Mageed
MoE
25
11
0
28 May 2021
TranSmart: A Practical Interactive Machine Translation System
Guoping Huang
Lemao Liu
Xing Wang
Longyue Wang
Huayang Li
Zhaopeng Tu
Chengyang Huang
Shuming Shi
18
32
0
27 May 2021
Scaling Ensemble Distribution Distillation to Many Classes with Proxy Targets
Max Ryabinin
A. Malinin
Mark Gales
UQCV
20
18
0
14 May 2021
ResMLP: Feedforward networks for image classification with data-efficient training
Hugo Touvron
Piotr Bojanowski
Mathilde Caron
Matthieu Cord
Alaaeldin El-Nouby
...
Gautier Izacard
Armand Joulin
Gabriel Synnaeve
Jakob Verbeek
Hervé Jégou
VLM
36
656
0
07 May 2021
Entailment as Few-Shot Learner
Sinong Wang
Han Fang
Madian Khabsa
Hanzi Mao
Hao Ma
35
183
0
29 Apr 2021
Multimodal Contrastive Training for Visual Representation Learning
Xin Yuan
Zhe-nan Lin
Jason Kuen
Jianming Zhang
Yilin Wang
Michael Maire
Ajinkya Kale
Baldo Faieta
SSL
30
153
0
26 Apr 2021
ScaleCom: Scalable Sparsified Gradient Compression for Communication-Efficient Distributed Training
Chia-Yu Chen
Jiamin Ni
Songtao Lu
Xiaodong Cui
Pin-Yu Chen
...
Naigang Wang
Swagath Venkataramani
Vijayalakshmi Srinivasan
Wei Zhang
K. Gopalakrishnan
29
66
0
21 Apr 2021
How to Train BERT with an Academic Budget
Peter Izsak
Moshe Berchansky
Omer Levy
23
113
0
15 Apr 2021
Improving Gender Translation Accuracy with Filtered Self-Training
Prafulla Kumar Choubey
Anna Currey
Prashant Mathur
Georgiana Dinu
23
10
0
15 Apr 2021
Reward Optimization for Neural Machine Translation with Learned Metrics
Raphael Shu
Kang Min Yoo
Jung-Woo Ha
35
12
0
15 Apr 2021
Lessons on Parameter Sharing across Layers in Transformers
Sho Takase
Shun Kiyono
25
85
0
13 Apr 2021
Adversarial Regularization as Stackelberg Game: An Unrolled Optimization Approach
Simiao Zuo
Chen Liang
Haoming Jiang
Xiaodong Liu
Pengcheng He
Jianfeng Gao
Weizhu Chen
T. Zhao
55
9
0
11 Apr 2021
Better Neural Machine Translation by Extracting Linguistic Information from BERT
Hassan S. Shavarani
Anoop Sarkar
24
15
0
07 Apr 2021
Relaxing the Conditional Independence Assumption of CTC-based ASR by Conditioning on Intermediate Predictions
Jumon Nozaki
Tatsuya Komatsu
20
71
0
06 Apr 2021
ODE Transformer: An Ordinary Differential Equation-Inspired Model for Neural Machine Translation
Bei Li
Quan Du
Tao Zhou
Shuhan Zhou
Xin Zeng
Tong Xiao
Jingbo Zhu
23
22
0
06 Apr 2021
Rethinking Perturbations in Encoder-Decoders for Fast Training
Sho Takase
Shun Kiyono
33
45
0
05 Apr 2021
A Practical Survey on Faster and Lighter Transformers
Quentin Fournier
G. Caron
Daniel Aloise
14
93
0
26 Mar 2021
Mask Attention Networks: Rethinking and Strengthen Transformer
Zhihao Fan
Yeyun Gong
Dayiheng Liu
Zhongyu Wei
Siyuan Wang
Jian Jiao
Nan Duan
Ruofei Zhang
Xuanjing Huang
34
72
0
25 Mar 2021
Finetuning Pretrained Transformers into RNNs
Jungo Kasai
Hao Peng
Yizhe Zhang
Dani Yogatama
Gabriel Ilharco
Nikolaos Pappas
Yi Mao
Weizhu Chen
Noah A. Smith
44
63
0
24 Mar 2021
BERT: A Review of Applications in Natural Language Processing and Understanding
M. V. Koroteev
VLM
25
196
0
22 Mar 2021
Self-Learning for Zero Shot Neural Machine Translation
Surafel Melaku Lakew
Matteo Negri
Marco Turchi
21
1
0
10 Mar 2021
IOT: Instance-wise Layer Reordering for Transformer Structures
Jinhua Zhu
Lijun Wu
Yingce Xia
Shufang Xie
Tao Qin
Wen-gang Zhou
Houqiang Li
Tie-Yan Liu
31
7
0
05 Mar 2021
An empirical analysis of phrase-based and neural machine translation
Hamidreza Ghader
29
1
0
04 Mar 2021
Random Feature Attention
Hao Peng
Nikolaos Pappas
Dani Yogatama
Roy Schwartz
Noah A. Smith
Lingpeng Kong
36
348
0
03 Mar 2021
OmniNet: Omnidirectional Representations from Transformers
Yi Tay
Mostafa Dehghani
V. Aribandi
Jai Gupta
Philip Pham
Zhen Qin
Dara Bahri
Da-Cheng Juan
Donald Metzler
47
26
0
01 Mar 2021
SparseBERT: Rethinking the Importance Analysis in Self-attention
Han Shi
Jiahui Gao
Xiaozhe Ren
Hang Xu
Xiaodan Liang
Zhenguo Li
James T. Kwok
23
54
0
25 Feb 2021
Linear Transformers Are Secretly Fast Weight Programmers
Imanol Schlag
Kazuki Irie
Jürgen Schmidhuber
46
225
0
22 Feb 2021
Medical Transformer: Gated Axial-Attention for Medical Image Segmentation
Jeya Maria Jose Valanarasu
Poojan Oza
I. Hacihaliloglu
Vishal M. Patel
ViT
MedIm
28
963
0
21 Feb 2021
Searching for Search Errors in Neural Morphological Inflection
Martina Forster
Clara Meister
Ryan Cotterell
30
5
0
16 Feb 2021
Fast End-to-End Speech Recognition via Non-Autoregressive Models and Cross-Modal Knowledge Transferring from BERT
Ye Bai
Jiangyan Yi
J. Tao
Zhengkun Tian
Zhengqi Wen
Shuai Zhang
RALM
33
51
0
15 Feb 2021
Is Space-Time Attention All You Need for Video Understanding?
Gedas Bertasius
Heng Wang
Lorenzo Torresani
ViT
283
1,989
0
09 Feb 2021
Few-Shot Domain Adaptation for Grammatical Error Correction via Meta-Learning
Shengsheng Zhang
Yaping Huang
Yun-Nung Chen
Liner Yang
Chencheng Wang
Erhong Yang
VLM
32
2
0
29 Jan 2021
Enriching Non-Autoregressive Transformer with Syntactic and Semantic Structures for Neural Machine Translation
Ye Liu
Yao Wan
Jianguo Zhang
Wenting Zhao
Philip S. Yu
25
23
0
22 Jan 2021
Fast offline Transformer-based end-to-end automatic speech recognition for real-world applications
Y. Oh
Kiyoung Park
Jeongue Park
OffRL
22
5
0
14 Jan 2021
Transformers in Vision: A Survey
Salman Khan
Muzammal Naseer
Munawar Hayat
Syed Waqas Zamir
Fahad Shahbaz Khan
M. Shah
ViT
227
2,431
0
04 Jan 2021
Reservoir Transformers
Sheng Shen
Alexei Baevski
Ari S. Morcos
Kurt Keutzer
Michael Auli
Douwe Kiela
35
17
0
30 Dec 2020
Understanding and Improving Encoder Layer Fusion in Sequence-to-Sequence Learning
Xuebo Liu
Longyue Wang
Derek F. Wong
Liang Ding
Lidia S. Chao
Zhaopeng Tu
AI4CE
27
35
0
29 Dec 2020
Learning Light-Weight Translation Models from Deep Transformer
Bei Li
Ziyang Wang
Hui Liu
Quan Du
Tong Xiao
Chunliang Zhang
Jingbo Zhu
VLM
120
40
0
27 Dec 2020
Sub-Linear Memory: How to Make Performers SLiM
Valerii Likhosherstov
K. Choromanski
Jared Davis
Xingyou Song
Adrian Weller
23
19
0
21 Dec 2020
A Closer Look at the Robustness of Vision-and-Language Pre-trained Models
Linjie Li
Zhe Gan
Jingjing Liu
VLM
33
42
0
15 Dec 2020
Reciprocal Supervised Learning Improves Neural Machine Translation
Minkai Xu
Mingxuan Wang
Zhouhan Lin
Hao Zhou
Weinan Zhang
Lei Li
12
0
0
05 Dec 2020
GottBERT: a pure German Language Model
Raphael Scheible
Fabian Thomczyk
P. Tippmann
V. Jaravine
M. Boeker
VLM
19
76
0
03 Dec 2020
How Can We Know When Language Models Know? On the Calibration of Language Models for Question Answering
Zhengbao Jiang
Jun Araki
Haibo Ding
Graham Neubig
UQCV
31
410
0
02 Dec 2020