1811.01136
Margin-based Parallel Corpus Mining with Multilingual Sentence Embeddings
3 November 2018
Mikel Artetxe
Holger Schwenk
Papers citing "Margin-based Parallel Corpus Mining with Multilingual Sentence Embeddings"
34 / 34 papers shown
Title
Improving Retrieval-Augmented Neural Machine Translation with Monolingual Data
Maxime Bouthors
Josep Crego
François Yvon
RALM
LRM
56
0
0
30 Apr 2025
Mitigating Semantic Leakage in Cross-lingual Embeddings via Orthogonality Constraint
Dayeon Ki
Cheonbok Park
H. Kim
FedML
39
0
0
24 Sep 2024
Adaptative Bilingual Aligning Using Multilingual Sentence Embedding
Olivier Kraif
23
0
0
18 Mar 2024
Exploring Representational Disparities Between Multilingual and Bilingual Translation Models
Neha Verma
Kenton W. Murray
Kevin Duh
21
0
0
23 May 2023
Democratizing Neural Machine Translation with OPUS-MT
Jörg Tiedemann
Mikko Aulamo
Daria Bakshandaeva
M. Boggia
Stig-Arne Gronroos
Tommi Nieminen
Alessandro Raganato
Yves Scherrer
Raúl Vázquez
Sami Virpioja
18
28
0
04 Dec 2022
SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations
Paul-Ambroise Duquenne
Hongyu Gong
Ning Dong
Jingfei Du
Ann Lee
Vedanuj Goswami
Changhan Wang
J. Pino
Benoît Sagot
Holger Schwenk
45
34
0
08 Nov 2022
Separating Grains from the Chaff: Using Data Filtering to Improve Multilingual Translation for Low-Resourced African Languages
Idris Abdulmumin
Michael Beukman
Jesujoba Oluwadara Alabi
Chris C. Emezue
Everlyn Asiko
...
Shamsuddeen Hassan Muhammad
Mofetoluwa Adeyemi
Oreen Yousuf
Sahib Singh
T. Gwadabe
34
7
0
19 Oct 2022
Multilingual Representation Distillation with Contrastive Learning
Weiting Tan
Kevin Heffernan
Holger Schwenk
Philipp Koehn
43
16
0
10 Oct 2022
The first neural machine translation system for the Erzya language
David Dale
78
7
0
19 Sep 2022
Bitext Mining for Low-Resource Languages via Contrastive Learning
Weiting Tan
Philipp Koehn
16
4
0
23 Aug 2022
Retrieval-Augmented Multilingual Keyphrase Generation with Retriever-Generator Iterative Training
Yifan Gao
Qingyu Yin
Zheng Li
Rui Meng
Tong Zhao
Bing Yin
Irwin King
M. Lyu
RALM
36
20
0
21 May 2022
Few-shot Mining of Naturally Occurring Inputs and Outputs
Mandar Joshi
Terra Blevins
M. Lewis
Daniel S. Weld
Luke Zettlemoyer
33
1
0
09 May 2022
Learning How to Translate North Korean through South Korean
Hwichan Kim
Sangwhan Moon
Naoaki Okazaki
Mamoru Komachi
30
2
0
27 Jan 2022
Improve Sentence Alignment by Divide-and-conquer
Wu Zhang
16
0
0
18 Jan 2022
Survey of Low-Resource Machine Translation
Barry Haddow
Rachel Bawden
Antonio Valerio Miceli Barone
Jindřich Helcl
Alexandra Birch
AIMat
33
150
0
01 Sep 2021
Aligning Cross-lingual Sentence Representations with Dual Momentum Contrast
Liang Wang
Wei-Ye Zhao
Jingming Liu
40
14
0
01 Sep 2021
Facebook AI WMT21 News Translation Task Submission
C. Tran
Shruti Bhosale
James Cross
Philipp Koehn
Sergey Edunov
Angela Fan
VLM
134
81
0
06 Aug 2021
Neural Machine Translation for Low-Resource Languages: A Survey
Surangika Ranathunga
E. Lee
Marjana Prifti Skenduli
Ravi Shekhar
Mehreen Alam
Rishemjit Kaur
38
236
0
29 Jun 2021
Exploiting Parallel Corpora to Improve Multilingual Embedding based Document and Sentence Alignment
Dilan Sachintha
Lakmali Piyarathna
Charith Rajitha
Surangika Ranathunga
24
3
0
12 Jun 2021
Unsupervised Multilingual Sentence Embeddings for Parallel Corpus Mining
Ivana Kvapilíková
Mikel Artetxe
Gorka Labaka
Eneko Agirre
Ondrej Bojar
SSL
19
36
0
21 May 2021
Bilingual Lexicon Induction via Unsupervised Bitext Construction and Word Alignment
Freda Shi
Luke Zettlemoyer
Sida I. Wang
SSL
29
33
0
01 Jan 2021
Unsupervised Bitext Mining and Translation via Self-trained Contextual Embeddings
Phillip Keung
Julian Salazar
Y. Lu
Noah A. Smith
SSL
27
25
0
15 Oct 2020
Not Low-Resource Anymore: Aligner Ensembling, Batch Filtering, and New Datasets for Bengali-English Machine Translation
Tahmid Hasan
Abhik Bhattacharjee
Kazi Samin Mubasshir
Masum Hasan
Madhusudan Basak
M. Rahman
Rifat Shahriyar
VLM
23
72
0
20 Sep 2020
CoVoST 2 and Massively Multilingual Speech-to-Text Translation
Changhan Wang
Anne Wu
J. Pino
SLR
27
72
0
20 Jul 2020
A Multilingual Parallel Corpora Collection Effort for Indian Languages
Shashank Siripragada
Jerin Philip
Vinay P. Namboodiri
C. V. Jawahar
VLM
32
47
0
15 Jul 2020
Cross-lingual Retrieval for Iterative Self-Supervised Training
C. Tran
Y. Tang
Xian Li
Jiatao Gu
RALM
28
72
0
16 Jun 2020
GeBioToolkit: Automatic Extraction of Gender-Balanced Multilingual Corpus of Wikipedia Biographies
Marta R. Costa-jussá
P. Lin
C. España-Bonet
SyDa
31
24
0
10 Dec 2019
Simple and Effective Paraphrastic Similarity from Parallel Translations
John Wieting
Kevin Gimpel
Graham Neubig
Taylor Berg-Kirkpatrick
27
49
0
30 Sep 2019
WikiMatrix: Mining 135M Parallel Sentences in 1620 Language Pairs from Wikipedia
Holger Schwenk
Vishrav Chaudhary
Shuo Sun
Hongyu Gong
Francisco Guzmán
CVBM
24
401
0
10 Jul 2019
Low-Resource Corpus Filtering using Multilingual Sentence Embeddings
Vishrav Chaudhary
Y. Tang
Francisco Guzmán
Holger Schwenk
Philipp Koehn
29
77
0
20 Jun 2019
Improving Multilingual Sentence Embedding using Bi-directional Dual Encoder with Additive Margin Softmax
Yinfei Yang
Gustavo Hernández Ábrego
Steve Yuan
Mandy Guo
Qinlan Shen
Daniel Cer
Yun-hsuan Sung
B. Strope
R. Kurzweil
52
115
0
22 Feb 2019
Word Translation Without Parallel Data
Alexis Conneau
Guillaume Lample
Marc'Aurelio Ranzato
Ludovic Denoyer
Hervé Jégou
189
1,639
0
11 Oct 2017
Six Challenges for Neural Machine Translation
Philipp Koehn
Rebecca Knowles
AAML
AIMat
224
1,208
0
12 Jun 2017
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,748
0
26 Sep 2016