2207.04672
Cited By
v1
v2
v3 (latest)
No Language Left Behind: Scaling Human-Centered Machine Translation
11 July 2022
NLLB Team
Marta R. Costa-jussá
James Cross
Onur Çelebi
Maha Elbayad
Kenneth Heafield
Kevin Heffernan
Elahe Kalbassi
Janice Lam
Daniel Licht
Jean Maillard
Anna Y. Sun
Skyler Wang
Guillaume Wenzek
Alison Youngblood
Bapi Akula
Loïc Barrault
Gabriel Mejia Gonzalez
Prangthip Hansanti
John Hoffman
Semarley Jarrett
Kaushik Ram Sadagopan
Dirk Rowe
Shannon L. Spruit
C. Tran
Pierre Yves Andrews
Necip Fazil Ayan
Shruti Bhosale
Sergey Edunov
Angela Fan
Cynthia Gao
Vedanuj Goswami
Francisco Guzmán
Philipp Koehn
Alexandre Mourachko
C. Ropers
Safiyyah Saleem
Holger Schwenk
Jeff Wang
MoE
Github (31473★)
Papers citing
"No Language Left Behind: Scaling Human-Centered Machine Translation"
50 / 801 papers shown
Title
Multilingual Audio Captioning using machine translated data
Matéo Cousin
Etienne Labbé
Thomas Pellegrini
98
4
0
14 Sep 2023
SIB-200: A Simple, Inclusive, and Big Evaluation Dataset for Topic Classification in 200+ Languages and Dialects
David Ifeoluwa Adelani
Hannah Liu
Xiaoyu Shen
Nikita Vassilyev
Jesujoba Oluwadara Alabi
Yanke Mao
Haonan Gao
Annie En-Shiun Lee
ELM
102
80
0
14 Sep 2023
ChatGPT MT: Competitive for High- (but not Low-) Resource Languages
Nathaniel R. Robinson
Perez Ogayo
David R. Mortensen
Graham Neubig
54
32
0
14 Sep 2023
Overview of GUA-SPA at IberLEF 2023: Guarani-Spanish Code Switching Analysis
Luis Chiruzzo
Marvin Aguero-Torales
Gustavo A. Giménez-Lugo
Aldo Alvarez
Yliana Rodríguez
Santiago Góngora
Thamar Solorio
430
8
0
12 Sep 2023
BHASA: A Holistic Southeast Asian Linguistic and Cultural Evaluation Suite for Large Language Models
Wei Qi Leong
Jian Gang Ngui
Yosephine Susanto
Hamsawardhini Rengarajan
Kengatharaiyer Sarveswaran
William-Chandra Tjhi
71
9
0
12 Sep 2023
EPA: Easy Prompt Augmentation on Large Language Models via Multiple Sources and Multiple Targets
Hongyuan Lu
Wai Lam
67
1
0
09 Sep 2023
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
Sneha Kudugunta
Isaac Caswell
Biao Zhang
Xavier Garcia
Christopher A. Choquette-Choo
...
Derrick Xin
Aditya Kusupati
Romi Stella
Ankur Bapna
Orhan Firat
134
141
0
09 Sep 2023
Gender-specific Machine Translation with Large Language Models
Eduardo Sánchez
Pierre Yves Andrews
Pontus Stenetorp
Mikel Artetxe
Marta R. Costa-jussá
65
3
0
06 Sep 2023
A deep Natural Language Inference predictor without language-specific training data
Lorenzo Corradi
Alessandro Manenti
Francesca Del Bonifro
Francesco Setti
D. Sorbo
28
0
0
06 Sep 2023
Automating Behavioral Testing in Machine Translation
Javier Ferrando
Matthias Sperber
Hendra Setiawan
Dominic Telaar
Saša Hasan
54
3
0
05 Sep 2023
NLLB-CLIP -- train performant multilingual image retrieval model on a budget
Alexander Visheratin
VLM
127
19
0
04 Sep 2023
Let the Models Respond: Interpreting Language Model Detoxification Through the Lens of Prompt Dependence
Daniel Scalena
Gabriele Sarti
Malvina Nissim
Elisabetta Fersini
52
0
0
01 Sep 2023
The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants
Lucas Bandarkar
Davis Liang
Benjamin Muller
Mikel Artetxe
Satya Narayan Shukla
Don Husa
Naman Goyal
Abhinandan Krishnan
Luke Zettlemoyer
Madian Khabsa
126
157
0
31 Aug 2023
The Gender-GAP Pipeline: A Gender-Aware Polyglot Pipeline for Gender Characterisation in 55 Languages
Benjamin Muller
Belen Alastruey
Prangthip Hansanti
Elahe Kalbassi
C. Ropers
Eric Michael Smith
Adina Williams
Luke Zettlemoyer
Pierre Yves Andrews
Marta R. Costa-jussá
42
4
0
31 Aug 2023
Translate Meanings, Not Just Words: IdiomKB's Role in Optimizing Idiomatic Translation with Language Models
Shuang Li
Jiangjie Chen
Siyu Yuan
Xinyi Wu
Hao Yang
Shimin Tao
Yanghua Xiao
86
20
0
26 Aug 2023
Ngambay-French Neural Machine Translation (sba-Fr)
Sakayo Toadoum Sari
Angela Fan
Lema Logamou Seknewna
56
0
0
25 Aug 2023
Pre-gated MoE: An Algorithm-System Co-Design for Fast and Scalable Mixture-of-Expert Inference
Ranggi Hwang
Jianyu Wei
Shijie Cao
Changho Hwang
Xiaohu Tang
Ting Cao
Mao Yang
MoE
120
46
0
23 Aug 2023
SONAR: Sentence-Level Multimodal and Language-Agnostic Representations
Paul-Ambroise Duquenne
Holger Schwenk
Benoît Sagot
AI4TS
VLM
118
71
0
22 Aug 2023
NaijaRC: A Multi-choice Reading Comprehension Dataset for Nigerian Languages
Anuoluwapo Aremu
Jesujoba Oluwadara Alabi
Daud Abolade
Nkechinyere F. Aguobi
Shamsuddeen Hassan Muhammad
David Ifeoluwa Adelani
110
4
0
18 Aug 2023
Playing with words: Comparing the vocabulary and lexical diversity of ChatGPT and humans
Pedro Reviriego
Javier Conde
Elena Merino-Gómez
Gonzalo Martínez
José Alberto Hernández
37
9
0
14 Aug 2023
Extrapolating Large Language Models to Non-English by Aligning Languages
Wenhao Zhu
Yunzhe Lv
Qingxiu Dong
Fei Yuan
Jingjing Xu
Shujian Huang
Lingpeng Kong
Jiajun Chen
Lei Li
102
72
0
09 Aug 2023
ChatGPT for Arabic Grammatical Error Correction
S. Kwon
Gagan Bhatia
El Moatez Billah Nagoudi
Muhammad Abdul-Mageed
75
7
0
08 Aug 2023
Character-level NMT and language similarity
Josef Jon
Ondrej Bojar
45
0
0
08 Aug 2023
Towards Multiple References Era -- Addressing Data Leakage and Limited Reference Diversity in NLG Evaluation
Xianfeng Zeng
Yanjun Liu
Fandong Meng
Jie Zhou
55
0
0
06 Aug 2023
TARJAMAT: Evaluation of Bard and ChatGPT on Machine Translation of Ten Arabic Varieties
Karima Kadaoui
Samar Magdy
Abdul Waheed
Md. Tawkat Islam Khondaker
Ahmed Oumar El-Shangiti
El Moatez Billah Nagoudi
Muhammad Abdul-Mageed
110
23
0
06 Aug 2023
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities
Weihao Yu
Zhengyuan Yang
Linjie Li
Jianfeng Wang
Kevin Qinghong Lin
Zicheng Liu
Xinchao Wang
Lijuan Wang
MLLM
152
720
0
04 Aug 2023
Sinhala-English Parallel Word Dictionary Dataset
Kasun Wickramasinghe
Nisansa de Silva
65
3
0
04 Aug 2023
Do Multilingual Language Models Think Better in English?
Julen Etxaniz
Gorka Azkune
Aitor Soroa Etxabe
Oier López de Lacalle
Mikel Artetxe
LRM
90
73
0
02 Aug 2023
An Unforgeable Publicly Verifiable Watermark for Large Language Models
Aiwei Liu
Leyi Pan
Xuming Hu
Shuang Li
Lijie Wen
Irwin King
Philip S. Yu
WaLM
126
37
0
30 Jul 2023
Multilingual Lexical Simplification via Paraphrase Generation
Kang Liu
Jipeng Qiang
Yun Li
Yunhao Yuan
Yi Zhu
Kaixun Hua
61
3
0
28 Jul 2023
Milimili. Collecting Parallel Data via Crowdsourcing
Alexander Antonov
FedML
20
0
0
23 Jul 2023
Multilingual Speech-to-Speech Translation into Multiple Target Languages
Hongyu Gong
Ning Dong
Sravya Popuri
Vedanuj Goswami
Ann Lee
J. Pino
82
5
0
17 Jul 2023
mBLIP: Efficient Bootstrapping of Multilingual Vision-LLMs
Gregor Geigle
Abhay Jain
Radu Timofte
Goran Glavaš
VLM
MLLM
123
32
0
13 Jul 2023
Pluggable Neural Machine Translation Models via Memory-augmented Adapters
Yuzhuang Xu
Shuo Wang
Peng Li
Xuebo Liu
Xiaolong Wang
Weidong Liu
Yang Liu
104
1
0
12 Jul 2023
TIM: Teaching Large Language Models to Translate with Comparison
Jiali Zeng
Fandong Meng
Yongjing Yin
Jie Zhou
118
57
0
10 Jul 2023
Should you marginalize over possible tokenizations?
Nadezhda Chirkova
Germán Kruszewski
Jos Rozen
Marc Dymetman
94
12
0
30 Jun 2023
Towards Measuring the Representation of Subjective Global Opinions in Language Models
Esin Durmus
Karina Nguyen
Thomas I. Liao
Nicholas Schiefer
Amanda Askell
...
Alex Tamkin
Janel Thamkul
Jared Kaplan
Jack Clark
Deep Ganguli
147
245
0
28 Jun 2023
xSIM++: An Improved Proxy to Bitext Mining Performance for Low-Resource Languages
Mingda Chen
Kevin Heffernan
Onur Çelebi
Alexandre Mourachko
Holger Schwenk
54
3
0
22 Jun 2023
Unveiling Global Narratives: A Multilingual Twitter Dataset of News Media on the Russo-Ukrainian Conflict
Sherzod Hakimov
Gullal Singh Cheema
55
3
0
22 Jun 2023
Multilingual Neural Machine Translation System for Indic to Indic Languages
Sudhansu Bala Das
Divyajyoti Panda
T. K. Mishra
Bidyut Kr. Patra
Asif Ekbal
71
0
0
22 Jun 2023
Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse Prompts
Xuan-Phi Nguyen
Sharifah Mahani Aljunied
Shafiq Joty
Lidong Bing
118
38
0
20 Jun 2023
BayLing: Bridging Cross-lingual Alignment and Instruction Following through Interactive Translation for Large Language Models
Shaolei Zhang
Qingkai Fang
Zhuocheng Zhang
Zhengrui Ma
Yan Zhou
...
Mengyu Bu
Shangtong Gui
Yunji Chen
Xilin Chen
Yang Feng
ALM
133
42
0
19 Jun 2023
Sheffield's Submission to the AmericasNLP Shared Task on Machine Translation into Indigenous Languages
Edward Gow-Smith
Danae Sánchez Villegas
84
9
0
16 Jun 2023
Babel-ImageNet: Massively Multilingual Evaluation of Vision-and-Language Representations
Gregor Geigle
Radu Timofte
Goran Glavaš
VLM
MLLM
61
5
0
14 Jun 2023
NAVER LABS Europe's Multilingual Speech Translation Systems for the IWSLT 2023 Low-Resource Track
Edward Gow-Smith
Alexandre Berard
Marcely Zanon Boito
Ioan Calapodescu
72
13
0
13 Jun 2023
Measuring Sentiment Bias in Machine Translation
Kai Hartung
Aaricia Herygers
Shubham Kurlekar
Khabbab Zakaria
Taylan Volkan
Sören Gröttrup
Munir Georges
AI4CE
70
5
0
12 Jun 2023
Learning Multilingual Sentence Representations with Cross-lingual Consistency Regularization
Pengzhi Gao
Liwen Zhang
Zhongjun He
Hua Wu
Haifeng Wang
66
7
0
12 Jun 2023
WSPAlign: Word Alignment Pre-training via Large-Scale Weakly Supervised Span Prediction
Qiyu Wu
Masaaki Nagata
Yoshimasa Tsuruoka
61
5
0
09 Jun 2023
Customizing General-Purpose Foundation Models for Medical Report Generation
Bang-ju Yang
Asif Raza
Yuexian Zou
Tong Zhang
MedIm
87
11
0
09 Jun 2023
M³IT: A Large-Scale Dataset towards Multi-Modal Multilingual Instruction Tuning
Lei Li
Yuwei Yin
Shicheng Li
Liang Chen
Peiyi Wang
...
Yazheng Yang
Jingjing Xu
Xu Sun
Lingpeng Kong
Qi Liu
MLLM
VLM
96
120
0
07 Jun 2023