Bitext Mining Using Distilled Sentence Representations for Low-Resource Languages

25 May 2022
Kevin Heffernan
Onur Çelebi
Holger Schwenk

Papers citing "Bitext Mining Using Distilled Sentence Representations for Low-Resource Languages"

36 / 36 papers shown
Adapters for Altering LLM Vocabularies: What Languages Benefit the Most?
HyoJung Han
Akiko Eriguchi
Haoran Xu
Hieu T. Hoang
Marine Carpuat
Huda Khayrallah
VLM
76
3
0
12 Oct 2024
Cross-lingual Human-Preference Alignment for Neural Machine Translation with Direct Quality Optimization
Kaden Uhlig
Joern Wuebker
Raphael Reinauer
John DeNero
93
0
0
26 Sep 2024
Bridging the Language Gap: Enhancing Multilingual Prompt-Based Code Generation in LLMs via Zero-Shot Cross-Lingual Transfer
Mingda Li
Abhijit Mishra
Utkarsh Mujumdar
94
0
0
19 Aug 2024
Language Models are Universal Embedders
Xin Zhang
Zehan Li
Yanzhao Zhang
Dingkun Long
Pengjun Xie
Meishan Zhang
Min Zhang
KELM ELM
272
9
0
12 Oct 2023
No Language Left Behind: Scaling Human-Centered Machine Translation
NLLB Team
Marta R. Costa-jussá
James Cross
Onur Çelebi
Maha Elbayad
...
Alexandre Mourachko
C. Ropers
Safiyyah Saleem
Holger Schwenk
Jeff Wang
MoE
234
1,268
0
11 Jul 2022
Building Machine Translation Systems for the Next Thousand Languages
Ankur Bapna
Isaac Caswell
Julia Kreutzer
Orhan Firat
D. Esch
...
Apurva Shah
Yanping Huang
Zhiwen Chen
Yonghui Wu
Macduff Hughes
109
101
0
09 May 2022
DeepNet: Scaling Transformers to 1,000 Layers
Hongyu Wang
Shuming Ma
Li Dong
Shaohan Huang
Dongdong Zhang
Furu Wei
MoE AI4CE
133
162
0
01 Mar 2022
Towards the Next 1000 Languages in Multilingual Machine Translation: Exploring the Synergy Between Supervised and Self-Supervised Learning
Aditya Siddhant
Ankur Bapna
Orhan Firat
Yuan Cao
Mengzhao Chen
Isaac Caswell
Xavier Garcia
ELM LRM
68
29
0
09 Jan 2022
English2Gbe: A multilingual machine translation model for {Fon/Ewe}Gbe
Gilles Hacheme
51
4
0
13 Dec 2021
DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders
Shuming Ma
Li Dong
Shaohan Huang
Dongdong Zhang
Alexandre Muzio
Saksham Singhal
Hany Awadalla
Xia Song
Furu Wei
SLR AI4CE
78
81
0
25 Jun 2021
The FLORES-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation
Naman Goyal
Cynthia Gao
Vishrav Chaudhary
Peng-Jen Chen
Guillaume Wenzek
Da Ju
Sanjan Krishnan
Marc'Aurelio Ranzato
Francisco Guzmán
Angela Fan
124
588
0
06 Jun 2021
XTREME-R: Towards More Challenging and Nuanced Multilingual Evaluation
Sebastian Ruder
Noah Constant
Jan A. Botha
Aditya Siddhant
Orhan Firat
...
Pengfei Liu
Junjie Hu
Dan Garrette
Graham Neubig
Melvin Johnson
ELM AAML LRM
93
189
0
15 Apr 2021
Samanantar: The Largest Publicly Available Parallel Corpora Collection for 11 Indic Languages
Gowtham Ramesh
Sumanth Doddapaneni
Aravinth Bheemaraj
Mayank Jobanputra
AK Raghavan
...
K. Deepak
Vivek Raghavan
Anoop Kunchukuttan
Pratyush Kumar
Mitesh Khapra
LRM
102
235
0
12 Apr 2021
AI4D -- African Language Program
Kathleen Siminyu
Godson Kalipe
D. Orlic
Jade Z. Abbott
Vukosi Marivate
...
T. Diop
Davis David
Chayma Fourati
Hatem Haddad
Malek Naski
41
21
0
06 Apr 2021
NLP for Ghanaian Languages
P. Azunre
Salomey Osei
S. Addo
Lawrence Asamoah Adu-Gyamfi
Stephen E. Moore
...
Standylove Birago Mensah
Lucien Mensah
Mark Amoako Marcel
A. Amponsah
J. B. Hayfron-Acquah
26
5
0
29 Mar 2021
Beyond English-Centric Multilingual Machine Translation
Angela Fan
Shruti Bhosale
Holger Schwenk
Zhiyi Ma
Ahmed El-Kishky
...
Vitaliy Liptchinsky
Sergey Edunov
Edouard Grave
Michael Auli
Armand Joulin
LRM
96
859
0
21 Oct 2020
Participatory Research for Low-resourced Machine Translation: A Case Study in African Languages
W. Nekoto
Vukosi Marivate
T. Matsila
Timi E. Fasubaa
T. Kolawole
...
A. Olabiyi
A. Ramkilowan
A. Oktem
Adewale Akinfaderin
Abdallah Bashir
91
210
0
05 Oct 2020
Language-agnostic BERT Sentence Embedding
Fangxiaoyu Feng
Yinfei Yang
Daniel Cer
N. Arivazhagan
Wei Wang
176
915
0
03 Jul 2020
FFR v1.1: Fon-French Neural Machine Translation
Bonaventure F. P. Dossou
Chris C. Emezue
67
26
0
14 Jun 2020
Extending Multilingual BERT to Low-Resource Languages
Zihan Wang
Karthikeyan K
Stephen D. Mayhew
Dan Roth
VLM
71
132
0
28 Apr 2020
Making Monolingual Sentence Embeddings Multilingual using Knowledge Distillation
Nils Reimers
Iryna Gurevych
104
1,032
0
21 Apr 2020
Multilingual Machine Translation: Closing the Gap between Shared and Language-specific Encoder-Decoders
Carlos Escolano
Marta R. Costa-jussá
José A. R. Fonollosa
Mikel Artetxe
52
55
0
14 Apr 2020
XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization
Junjie Hu
Sebastian Ruder
Aditya Siddhant
Graham Neubig
Orhan Firat
Melvin Johnson
ELM
208
977
0
24 Mar 2020
CCMatrix: Mining Billions of High-Quality Parallel Sentences on the WEB
Holger Schwenk
Guillaume Wenzek
Sergey Edunov
Edouard Grave
Armand Joulin
94
262
0
10 Nov 2019
Unsupervised Cross-lingual Representation Learning at Scale
Alexis Conneau
Kartikay Khandelwal
Naman Goyal
Vishrav Chaudhary
Guillaume Wenzek
Francisco Guzmán
Edouard Grave
Myle Ott
Luke Zettlemoyer
Veselin Stoyanov
228
6,593
0
05 Nov 2019
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Nils Reimers
Iryna Gurevych
1.3K
12,332
0
27 Aug 2019
Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges
N. Arivazhagan
Ankur Bapna
Orhan Firat
Dmitry Lepikhin
Melvin Johnson
...
George F. Foster
Colin Cherry
Wolfgang Macherey
Zhiwen Chen
Yonghui Wu
96
428
0
11 Jul 2019
Benchmarking Neural Machine Translation for Southern African Languages
Laura Martinus
Jade Z. Abbott
74
18
0
17 Jun 2019
Improving Multilingual Sentence Embedding using Bi-directional Dual Encoder with Additive Margin Softmax
Yinfei Yang
Gustavo Hernández Ábrego
Steve Yuan
Mandy Guo
Qinlan Shen
Daniel Cer
Yun-hsuan Sung
B. Strope
R. Kurzweil
80
118
0
22 Feb 2019
Cross-lingual Language Model Pretraining
Guillaume Lample
Alexis Conneau
116
2,751
0
22 Jan 2019
Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond
Mikel Artetxe
Holger Schwenk
3DV
158
1,018
0
26 Dec 2018
Margin-based Parallel Corpus Mining with Multilingual Sentence Embeddings
Mikel Artetxe
Holger Schwenk
67
202
0
03 Nov 2018
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Jacob Devlin
Ming-Wei Chang
Kenton Lee
Kristina Toutanova
VLM SSL SSeg
1.8K
95,324
0
11 Oct 2018
Effective Parallel Corpus Mining using Bilingual Sentence Embeddings
Mandy Guo
Qinlan Shen
Yinfei Yang
Heming Ge
Daniel Cer
...
K. Stevens
Noah Constant
Yun-hsuan Sung
B. Strope
R. Kurzweil
111
111
0
31 Jul 2018
Filtering and Mining Parallel Data in a Joint Multilingual Space
Holger Schwenk
71
109
0
24 May 2018
Learning Word Vectors for 157 Languages
Edouard Grave
Piotr Bojanowski
Prakhar Gupta
Armand Joulin
Tomas Mikolov
SSL FaML
119
1,429
0
19 Feb 2018