1902.01382
The FLoRes Evaluation Datasets for Low-Resource Machine Translation: Nepali-English and Sinhala-English
4 February 2019
Francisco Guzmán
Peng-Jen Chen
Myle Ott
J. Pino
Guillaume Lample
Philipp Koehn
Vishrav Chaudhary
Marc'Aurelio Ranzato
Papers citing
"The FLoRes Evaluation Datasets for Low-Resource Machine Translation: Nepali-English and Sinhala-English"
45 / 45 papers shown
Title
Baichuan 2: Open Large-scale Language Models
Ai Ming Yang
Bin Xiao
Bingning Wang
Borong Zhang
Ce Bian
...
Youxin Jiang
Yuchen Gao
Yupeng Zhang
Zenan Zhou
Zhiying Wu
ELM
LRM
77
710
0
19 Sep 2023
On the Off-Target Problem of Zero-Shot Multilingual Neural Machine Translation
Liang Chen
Shuming Ma
Dongdong Zhang
Furu Wei
Baobao Chang
20
5
0
18 May 2023
Language Model Tokenizers Introduce Unfairness Between Languages
Aleksandar Petrov
Emanuele La Malfa
Philip Torr
Adel Bibi
45
98
0
17 May 2023
Bilex Rx: Lexical Data Augmentation for Massively Multilingual Machine Translation
Alex Jones
Isaac Caswell
Ishan Saxena
Orhan Firat
23
9
0
27 Mar 2023
Understanding and Detecting Hallucinations in Neural Machine Translation via Model Introspection
Weijia Xu
Sweta Agrawal
Eleftheria Briakou
Marianna J. Martindale
Marine Carpuat
HILM
27
47
0
18 Jan 2023
Some Languages are More Equal than Others: Probing Deeper into the Linguistic Disparity in the NLP World
Surangika Ranathunga
Nisansa de Silva
50
35
0
16 Oct 2022
Multilingual Representation Distillation with Contrastive Learning
Weiting Tan
Kevin Heffernan
Holger Schwenk
Philipp Koehn
43
16
0
10 Oct 2022
Consistent Human Evaluation of Machine Translation across Language Pairs
Daniel Licht
Cynthia Gao
Janice Lam
Francisco Guzmán
Mona T. Diab
Philipp Koehn
40
17
0
17 May 2022
Isomorphic Cross-lingual Embeddings for Low-Resource Languages
Sonal Sannigrahi
Jesse Read
35
1
0
28 Mar 2022
Improving English to Sinhala Neural Machine Translation using Part-of-Speech Tag
Ravinga Perera
Thilakshi Fonseka
Rashmini Naranpanawa
Uthayasanker Thayasivam
24
6
0
17 Feb 2022
DEEP: DEnoising Entity Pre-training for Neural Machine Translation
Junjie Hu
Hiroaki Hayashi
Kyunghyun Cho
Graham Neubig
AI4CE
27
21
0
14 Nov 2021
Quality Estimation Using Round-trip Translation with Sentence Embeddings
N. Crone
A. Power
John Weldon
24
3
0
31 Oct 2021
The Eval4NLP Shared Task on Explainable Quality Estimation: Overview and Results
M. Fomicheva
Piyawat Lertvittayakumjorn
Wei-Ye Zhao
Steffen Eger
Yang Gao
ELM
24
39
0
08 Oct 2021
AfroMT: Pretraining Strategies and Reproducible Benchmarks for Translation of 8 African Languages
Machel Reid
Junjie Hu
Graham Neubig
Y. Matsuo
77
31
0
10 Sep 2021
A Large-Scale Study of Machine Translation in the Turkic Languages
Jamshidbek Mirzakhalov
A. Babu
Duygu Ataman
S. Kariev
Francis M. Tyers
...
Esra Onal
Shaxnoza Pulatova
Ahsan Wahab
Orhan Firat
Sriram Chellappan
24
28
0
09 Sep 2021
Survey of Low-Resource Machine Translation
Barry Haddow
Rachel Bawden
Antonio Valerio Miceli Barone
Jindřich Helcl
Alexandra Birch
AIMat
39
150
0
01 Sep 2021
Can Transformers Jump Around Right in Natural Language? Assessing Performance Transfer from SCAN
Rahma Chaabouni
Roberto Dessì
Eugene Kharitonov
35
20
0
03 Jul 2021
Neural Machine Translation for Low-Resource Languages: A Survey
Surangika Ranathunga
E. Lee
Marjana Prifti Skenduli
Ravi Shekhar
Mehreen Alam
Rishemjit Kaur
40
236
0
29 Jun 2021
Exploiting Parallel Corpora to Improve Multilingual Embedding based Document and Sentence Alignment
Dilan Sachintha
Lakmali Piyarathna
Charith Rajitha
Surangika Ranathunga
24
3
0
12 Jun 2021
The FLORES-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation
Naman Goyal
Cynthia Gao
Vishrav Chaudhary
Peng-Jen Chen
Guillaume Wenzek
Da Ju
Sanjan Krishnan
Marc'Aurelio Ranzato
Francisco Guzmán
Angela Fan
15
559
0
06 Jun 2021
Samanantar: The Largest Publicly Available Parallel Corpora Collection for 11 Indic Languages
Gowtham Ramesh
Sumanth Doddapaneni
Aravinth Bheemaraj
Mayank Jobanputra
AK Raghavan
...
K. Deepak
Vivek Raghavan
Anoop Kunchukuttan
Pratyush Kumar
Mitesh Khapra
LRM
37
231
0
12 Apr 2021
Robust Experimentation in the Continuous Time Bandit Problem
Pasquale Antonante
27
0
0
31 Mar 2021
Improving the Lexical Ability of Pretrained Language Models for Unsupervised Neural Machine Translation
Alexandra Chronopoulou
Dario Stojanovski
Alexander Fraser
SSL
37
26
0
18 Mar 2021
The LMU Munich System for the WMT 2020 Unsupervised Machine Translation Shared Task
Alexandra Chronopoulou
Dario Stojanovski
Viktor Hangya
Alexander Fraser
37
5
0
25 Oct 2020
ChrEn: Cherokee-English Machine Translation for Endangered Language Revitalization
Shiyue Zhang
B. Frey
Joey Tianyi Zhou
38
28
0
09 Oct 2020
Harnessing Multilinguality in Unsupervised Machine Translation for Rare Languages
Xavier Garcia
Aditya Siddhant
Orhan Firat
Ankur P. Parikh
30
31
0
23 Sep 2020
Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation
Tahmid Hasan
Abhik Bhattacharjee
Kazi Samin Mubasshir
Masum Hasan
Madhusudan Basak
M. Rahman
Rifat Shahriyar
VLM
23
72
0
20 Sep 2020
Energy-Based Reranking: Improving Neural Machine Translation Using Energy-Based Models
Sumanta Bhattacharyya
Pedram Rooshenas
Subhajit Naskar
Simeng Sun
Mohit Iyyer
Andrew McCallum
37
57
0
20 Sep 2020
Reusing a Pretrained Language Model on Languages with Limited Corpora for Unsupervised NMT
Alexandra Chronopoulou
Dario Stojanovski
Alexander Fraser
18
33
0
16 Sep 2020
On Learning Language-Invariant Representations for Universal Machine Translation
Han Zhao
Junjie Hu
Andrej Risteski
43
8
0
11 Aug 2020
A Multilingual Parallel Corpora Collection Effort for Indian Languages
Shashank Siripragada
Jerin Philip
Vinay P. Namboodiri
C. V. Jawahar
VLM
32
47
0
15 Jul 2020
TICO-19: the Translation Initiative for Covid-19
Antonios Anastasopoulos
A. Cattelan
Zi-Yi Dou
Marcello Federico
C. Federmann
...
Mengmeng Niu
A. Oktem
Eric Paquin
G. Tang
Sylwia Tur
24
90
0
03 Jul 2020
Composed Fine-Tuning: Freezing Pre-Trained Denoising Autoencoders for Improved Generalization
Sang Michael Xie
Tengyu Ma
Percy Liang
35
13
0
29 Jun 2020
Unsupervised Translation of Programming Languages
Marie-Anne Lachaux
Baptiste Roziere
L. Chanussot
Guillaume Lample
45
409
0
05 Jun 2020
Recipes for Adapting Pre-trained Monolingual and Multilingual Models to Machine Translation
Asa Cooper Stickland
Xian Li
Marjan Ghazvininejad
36
44
0
30 Apr 2020
Exploiting Sentence Order in Document Alignment
Brian Thompson
Philipp Koehn
27
19
0
30 Apr 2020
When and Why is Unsupervised Neural Machine Translation Useless?
Yunsu Kim
Miguel Graça
Hermann Ney
SSL
25
70
0
22 Apr 2020
Multilingual Denoising Pre-training for Neural Machine Translation
Yinhan Liu
Jiatao Gu
Naman Goyal
Xian Li
Sergey Edunov
Marjan Ghazvininejad
M. Lewis
Luke Zettlemoyer
AI4CE
AIMat
52
1,773
0
22 Jan 2020
A Comprehensive Survey of Multilingual Neural Machine Translation
Raj Dabre
Chenhui Chu
Anoop Kunchukuttan
LRM
36
33
0
04 Jan 2020
CCAligned: A Massive Collection of Cross-Lingual Web-Document Pairs
Ahmed El-Kishky
Vishrav Chaudhary
Francisco Guzmán
Philipp Koehn
22
198
0
10 Nov 2019
Low-Resource Corpus Filtering using Multilingual Sentence Embeddings
Vishrav Chaudhary
Y. Tang
Francisco Guzmán
Holger Schwenk
Philipp Koehn
34
78
0
20 Jun 2019
Generalized Data Augmentation for Low-Resource Translation
Mengzhou Xia
X. Kong
Antonios Anastasopoulos
Graham Neubig
18
119
0
10 Jun 2019
Effective Cross-lingual Transfer of Neural Machine Translation Models without Shared Vocabularies
Yunsu Kim
Yingbo Gao
Hermann Ney
VLM
24
88
0
14 May 2019
Word Translation Without Parallel Data
Alexis Conneau
Guillaume Lample
Marc'Aurelio Ranzato
Ludovic Denoyer
Hervé Jégou
189
1,639
0
11 Oct 2017
Six Challenges for Neural Machine Translation
Philipp Koehn
Rebecca Knowles
AAML
AIMat
224
1,209
0
12 Jun 2017