ResearchTrend.AI
arXiv: 2207.04672
No Language Left Behind: Scaling Human-Centered Machine Translation

11 July 2022
NLLB Team
Marta R. Costa-jussá
James Cross
Onur Çelebi
Maha Elbayad
Kenneth Heafield
Kevin Heffernan
Elahe Kalbassi
Janice Lam
Daniel Licht
Jean Maillard
Anna Y. Sun
Skyler Wang
Guillaume Wenzek
Alison Youngblood
Bapi Akula
Loïc Barrault
Gabriel Mejia Gonzalez
Prangthip Hansanti
John Hoffman
Semarley Jarrett
Kaushik Ram Sadagopan
Dirk Rowe
Shannon L. Spruit
C. Tran
Pierre Yves Andrews
Necip Fazil Ayan
Shruti Bhosale
Sergey Edunov
Angela Fan
Cynthia Gao
Vedanuj Goswami
Francisco Guzmán
Philipp Koehn
Alexandre Mourachko
C. Ropers
Safiyyah Saleem
Holger Schwenk
Jeff Wang
    MoE

Papers citing "No Language Left Behind: Scaling Human-Centered Machine Translation"

50 / 801 papers shown
Curriculum Learning for Cross-Lingual Data-to-Text Generation With Noisy Data
Kancharla Aditya Hari
Manish Gupta
Vasudeva Varma
154
0
0
18 Dec 2024
Cross-Lingual Transfer of Debiasing and Detoxification in Multilingual LLMs: An Extensive Investigation
Vera Neplenbroek
Arianna Bisazza
Raquel Fernández
209
1
0
18 Dec 2024
Beyond Data Quantity: Key Factors Driving Performance in Multilingual Language Models
Sina Bagheri Nezhad
Ameeta Agrawal
Rhitabrat Pokharel
LRM
119
2
0
17 Dec 2024
Multilingual and Explainable Text Detoxification with Parallel Corpora
Daryna Dementieva
N. Babakov
Amit Ronen
Abinew Ali Ayele
Naquee Rizwan
...
Elisei Stakovskii
Eran Kaufman
Ashraf Elnagar
Animesh Mukherjee
Alexander Panchenko
134
3
0
16 Dec 2024
MT-LENS: An all-in-one Toolkit for Better Machine Translation Evaluation
Javier García Gilabert
Carlos Escolano
Audrey Mash
Xixian Liao
Maite Melero
AIMat ELM
126
0
0
16 Dec 2024
Task-Oriented Dialog Systems for the Senegalese Wolof Language
Derguene Mbaye
Moussa Diallo
123
0
0
15 Dec 2024
Analyzing the Attention Heads for Pronoun Disambiguation in Context-aware Machine Translation Models
Paweł Mąka
Yusuf Can Semerci
Jan Scholtes
Gerasimos Spanakis
113
0
0
15 Dec 2024
jina-clip-v2: Multilingual Multimodal Embeddings for Text and Images
Andreas Koukounas
Georgios Mastrapas
Bo Wang
Mohammad Kalim Akram
Sedigheh Eslami
Michael Gunther
Isabelle Mohr
Saba Sturua
Scott Martens
Nan Wang
VLM
355
10
0
11 Dec 2024
MIT-10M: A Large Scale Parallel Corpus of Multilingual Image Translation
Bo Li
Shaolin Zhu
Lijie Wen
VLM
127
2
0
10 Dec 2024
A polar coordinate system represents syntax in large language models
Pablo Diego-Simón
Stéphane d'Ascoli
Emmanuel Chemla
Yair Lakretz
J. King
LLMSV
121
0
0
07 Dec 2024
SailCompass: Towards Reproducible and Robust Evaluation for Southeast Asian Languages
Jia Guo
Longxu Dou
Guangtao Zeng
Stanley Kok
Wei Lu
Qian Liu
ELM LRM
129
2
0
02 Dec 2024
MuLan: Adapting Multilingual Diffusion Models for Hundreds of Languages with Negligible Cost
Sen Xing
Muyan Zhong
Zeqiang Lai
Liangchen Li
Jing Liu
Yaohui Wang
Jifeng Dai
Wenhai Wang
207
2
0
02 Dec 2024
Uhura: A Benchmark for Evaluating Scientific Question Answering and Truthfulness in Low-Resource African Languages
Edward Bayes
Israel Abebe Azime
Jesujoba Oluwadara Alabi
Jonas Kgomo
Tyna Eloundou
...
Shamsuddeen Hassan Muhammad
Choice Mpanza
Igneciah Pocia Thete
Dietrich Klakow
David Ifeoluwa Adelani
HILM ELM
144
7
0
01 Dec 2024
AMPS: ASR with Multimodal Paraphrase Supervision
Amruta Parulekar
Abhishek Gupta
Sameep Chattopadhyay
Preethi Jyothi
143
0
0
27 Nov 2024
PEFTGuard: Detecting Backdoor Attacks Against Parameter-Efficient Fine-Tuning
Zhen Sun
Tianshuo Cong
Yule Liu
Chenhao Lin
Xinlei He
Rongmao Chen
Xingshuo Han
Xinyi Huang
AAML
172
6
0
26 Nov 2024
All Languages Matter: Evaluating LMMs on Culturally Diverse 100 Languages
Ashmal Vayani
Dinura Dissanayake
Hasindri Watawana
Noor Ahsan
Nevasini Sasikumar
...
Monojit Choudhury
Ivan Laptev
Mubarak Shah
Salman Khan
Fahad A Khan
256
16
0
25 Nov 2024
Seed-Free Synthetic Data Generation Framework for Instruction-Tuning LLMs: A Case Study in Thai
Parinthapat Pengpun
Can Udomcharoenchaikit
Weerayut Buaphet
Peerat Limkonchotiwat
SyDa
154
2
0
23 Nov 2024
Why do language models perform worse for morphologically complex languages?
Catherine Arnett
Benjamin Bergen
114
12
0
21 Nov 2024
Low-resource Machine Translation: what for? who for? An observational study on a dedicated Tetun language translation service
Raphael Merx
Hanna Suominen
Adérito José Guterres Correia
Trevor Cohn
223
2
0
19 Nov 2024
Dialectal Toxicity Detection: Evaluating LLM-as-a-Judge Consistency Across Language Varieties
Fahim Faisal
Md. Mushfiqur Rahman
Antonios Anastasopoulos
49
4
0
17 Nov 2024
CULL-MT: Compression Using Language and Layer pruning for Machine Translation
Pedram Rostami
M. Dousti
97
1
0
10 Nov 2024
Fineweb-Edu-Ar: Machine-translated Corpus to Support Arabic Small Language Models
Sultan Alrashed
Dmitrii Khizbullin
David R. Pugh
69
0
0
10 Nov 2024
Using Language Models to Disambiguate Lexical Choices in Translation
Josh Barua
Sanjay Subramanian
Kayo Yin
Alane Suhr
40
1
0
08 Nov 2024
Fine-Grained Reward Optimization for Machine Translation using Error Severity Mappings
Miguel Moura Ramos
Tomás Almeida
Daniel Vareta
Filipe Azevedo
Sweta Agrawal
Patrick Fernandes
André F. T. Martins
123
4
0
08 Nov 2024
FASSILA: A Corpus for Algerian Dialect Fake News Detection and Sentiment Analysis
Amin Abdedaiem
Abdelhalim Hafedh Dahou
Mohamed Amine Cheragui
Brigitte Mathiak
CVBM
80
2
0
07 Nov 2024
VTechAGP: An Academic-to-General-Audience Text Paraphrase Dataset and Benchmark Models
Ming Cheng
Jiaying Gong
Chenhan Yuan
William A. Ingram
Edward A. Fox
Hoda Eldardiry
240
1
0
07 Nov 2024
No Culture Left Behind: ArtELingo-28, a Benchmark of WikiArt with Captions in 28 Languages
Youssef Mohamed
Runjia Li
Ibrahim Said Ahmad
Kilichbek Haydarov
Philip Torr
Kenneth Church
Mohamed Elhoseiny
VLM
94
11
0
06 Nov 2024
Mitigating Metric Bias in Minimum Bayes Risk Decoding
Geza Kovacs
Daniel Deutsch
Markus Freitag
107
8
0
05 Nov 2024
Code-Switching Curriculum Learning for Multilingual Transfer in LLMs
Haneul Yoo
Cheonbok Park
Sangdoo Yun
Alice Oh
Hwaran Lee
93
5
0
04 Nov 2024
Prompting with Phonemes: Enhancing LLMs' Multilinguality for Non-Latin Script Languages
Hoang Nguyen
Khyati Mahajan
Vikas Yadav
Philip S. Yu
Masoud Hashemi
Rishabh Maheshwary
138
0
0
04 Nov 2024
MoCE: Adaptive Mixture of Contextualization Experts for Byte-based Neural Machine Translation
Langlin Huang
Mengyu Bu
Yang Feng
101
0
0
03 Nov 2024
SPRING Lab IITM's submission to Low Resource Indic Language Translation Shared Task
Hamees Sayed
Advait Joglekar
S. Umesh
86
1
0
01 Nov 2024
Leveraging LLMs for MT in Crisis Scenarios: a blueprint for low-resource languages
Séamus Lankford
Andy Way
100
0
0
31 Oct 2024
GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages
Amir Hossein Kargaran
François Yvon
Hinrich Schütze
VLM
121
8
0
31 Oct 2024
Crowdsourcing Lexical Diversity
H. Khalilia
Jahna Otterbacher
Gábor Bella
Rusma Noortyani
Shandy Darma
Fausto Giunchiglia
66
2
0
30 Oct 2024
ProMoE: Fast MoE-based LLM Serving using Proactive Caching
Xiaoniu Song
Zihang Zhong
Rong Chen
Haibo Chen
MoE
134
6
0
29 Oct 2024
Revisiting Reliability in Large-Scale Machine Learning Research Clusters
Apostolos Kokolis
Michael Kuchnik
John Hoffman
Adithya Kumar
Parth Malani
Faye Ma
Zachary DeVito
Siyang Song
Kalyan Saladi
Carole-Jean Wu
330
9
0
29 Oct 2024
From English-Centric to Effective Bilingual: LLMs with Custom Tokenizers for Underrepresented Languages
Artur Kiulian
Anton Polishko
M. Khandoga
Yevhen Kostiuk
Guillermo Gabrielli
...
Hrishikesh Garud
Wendy Wing Yee Mak
Dmytro Chaplynskyi
Selma Belhadj Amor
Grigol Peradze
84
0
0
24 Oct 2024
How Good Are LLMs for Literary Translation, Really? Literary Translation Evaluation with Humans and LLMs
Ran Zhang
Wei Zhao
Steffen Eger
142
10
0
24 Oct 2024
Together We Can: Multilingual Automatic Post-Editing for Low-Resource Languages
S. Deoghare
Diptesh Kanojia
Pushpak Bhattacharyya
51
0
0
23 Oct 2024
MojoBench: Language Modeling and Benchmarks for Mojo
Nishat Raihan
Joanna C. S. Santos
Marcos Zampieri
86
2
0
23 Oct 2024
Can General-Purpose Large Language Models Generalize to English-Thai Machine Translation ?
Jirat Chiaranaipanich
Naiyarat Hanmatheekuna
Jitkapat Sawatphol
Krittamate Tiankanon
Jiramet Kinchagawat
Amrest Chinkamol
Parinthapat Pengpun
Piyalitt Ittichaiwong
Peerat Limkonchotiwat
31
0
0
22 Oct 2024
VEMOCLAP: A video emotion classification web application
Serkan Sulun
Paula Viana
M. Davies
VLM
84
0
0
22 Oct 2024
Exploring Continual Fine-Tuning for Enhancing Language Ability in Large Language Model
Divyanshu Aggarwal
Sankarshan Damle
Navin Goyal
Satya Lokam
Sunayana Sitaram
CLL
62
1
0
21 Oct 2024
Back to School: Translation Using Grammar Books
Jonathan Hus
Antonios Anastasopoulos
AI4Ed
47
4
0
20 Oct 2024
Grammatical Error Correction for Low-Resource Languages: The Case of Zarma
Mamadou K. Keita
Christopher Homan
Sofiane Abdoulaye Hamani
Adwoa Bremang
Marcos Zampieri
Habibatou Abdoulaye Alfari
Elysabhete Amadou Ibrahim
127
0
0
20 Oct 2024
M-RewardBench: Evaluating Reward Models in Multilingual Settings
Srishti Gureja
Lester James V. Miranda
Shayekh Bin Islam
Rishabh Maheshwary
Drishti Sharma
Gusti Winata
Nathan Lambert
Sebastian Ruder
Sara Hooker
Marzieh Fadaee
LRM
138
24
0
20 Oct 2024
A Complexity-Based Theory of Compositionality
Eric Elmoznino
Thomas Jiralerspong
Yoshua Bengio
Guillaume Lajoie
CoGe
157
10
0
18 Oct 2024
Towards Cross-Cultural Machine Translation with Retrieval-Augmented Generation from Multilingual Knowledge Graphs
Simone Conia
Daniel Lee
Min Li
U. F. Minhas
Saloni Potdar
Yunyao Li
83
11
0
17 Oct 2024
Parameter-efficient Adaptation of Multilingual Multimodal Models for Low-resource ASR
Abhishek Gupta
Amruta Parulekar
Sameep Chattopadhyay
Preethi Jyothi
VLM
53
0
0
17 Oct 2024