1805.09822
Filtering and Mining Parallel Data in a Joint Multilingual Space
24 May 2018
Holger Schwenk
Papers citing
"Filtering and Mining Parallel Data in a Joint Multilingual Space"
28 / 28 papers shown
Mitigating Semantic Leakage in Cross-lingual Embeddings via Orthogonality Constraint
Dayeon Ki
Cheonbok Park
H. Kim
FedML
39
0
0
24 Sep 2024
Critical Learning Periods: Leveraging Early Training Dynamics for Efficient Data Pruning
E. Chimoto
Jay Gala
Orevaoghene Ahia
Julia Kreutzer
Bruce A. Bassett
Sara Hooker
VLM
42
4
0
29 May 2024
Discovering Language Model Behaviors with Model-Written Evaluations
Ethan Perez
Sam Ringer
Kamilė Lukošiūtė
Karina Nguyen
Edwin Chen
...
Danny Hernandez
Deep Ganguli
Evan Hubinger
Nicholas Schiefer
Jared Kaplan
ALM
22
367
0
19 Dec 2022
SpeechMatrix: A Large-Scale Mined Corpus of Multilingual Speech-to-Speech Translations
Paul-Ambroise Duquenne
Hongyu Gong
Ning Dong
Jingfei Du
Ann Lee
Vedanuj Goswami
Changhan Wang
J. Pino
Benoît Sagot
Holger Schwenk
45
34
0
08 Nov 2022
Very Low Resource Sentence Alignment: Luhya and Swahili
E. Chimoto
Bruce A. Bassett
CVBM
16
10
0
31 Oct 2022
SAMU-XLSR: Semantically-Aligned Multimodal Utterance-level Cross-Lingual Speech Representation
Sameer Khurana
Antoine Laurent
James R. Glass
25
36
0
17 May 2022
Cross-Lingual Phrase Retrieval
Heqi Zheng
Xiao Zhang
Zewen Chi
Heyan Huang
T. Yan
Tian Lan
Wei Wei
Xian-Ling Mao
RALM
LRM
35
3
0
19 Apr 2022
Pre-Trained Multilingual Sequence-to-Sequence Models: A Hope for Low-Resource Language Translation?
E. Lee
Sarubi Thillainathan
Shravan Nayak
Surangika Ranathunga
David Ifeoluwa Adelani
Ruisi Su
Arya D. McCarthy
VLM
21
43
0
16 Mar 2022
Learning How to Translate North Korean through South Korean
Hwichan Kim
Sangwhan Moon
Naoaki Okazaki
Mamoru Komachi
30
2
0
27 Jan 2022
Self-Supervised Knowledge Assimilation for Expert-Layman Text Style Transfer
Wenda Xu
Michael Stephen Saxon
Misha Sra
Wei Wang
MedIm
19
13
0
06 Oct 2021
Neural Machine Translation for Low-Resource Languages: A Survey
Surangika Ranathunga
E. Lee
Marjana Prifti Skenduli
Ravi Shekhar
Mehreen Alam
Rishemjit Kaur
38
236
0
29 Jun 2021
Unsupervised Multilingual Sentence Embeddings for Parallel Corpus Mining
Ivana Kvapilíková
Mikel Artetxe
Gorka Labaka
Eneko Agirre
Ondrej Bojar
SSL
19
36
0
21 May 2021
The Curious Case of Hallucinations in Neural Machine Translation
Vikas Raunak
Arul Menezes
Marcin Junczys-Dowmunt
44
190
0
14 Apr 2021
Bilingual Lexicon Induction via Unsupervised Bitext Construction and Word Alignment
Freda Shi
Luke Zettlemoyer
Sida I. Wang
SSL
29
32
0
01 Jan 2021
Unsupervised Bitext Mining and Translation via Self-trained Contextual Embeddings
Phillip Keung
Julian Salazar
Y. Lu
Noah A. Smith
SSL
27
25
0
15 Oct 2020
On Learning Language-Invariant Representations for Universal Machine Translation
Hao Zhao
Junjie Hu
Andrej Risteski
40
8
0
11 Aug 2020
A Multilingual Parallel Corpora Collection Effort for Indian Languages
Shashank Siripragada
Jerin Philip
Vinay P. Namboodiri
C. V. Jawahar
VLM
32
47
0
15 Jul 2020
Cross-lingual Retrieval for Iterative Self-Supervised Training
C. Tran
Y. Tang
Xian Li
Jiatao Gu
RALM
28
72
0
16 Jun 2020
A Bilingual Generative Transformer for Semantic Sentence Embedding
John Wieting
Graham Neubig
Taylor Berg-Kirkpatrick
16
28
0
10 Nov 2019
Simple and Effective Paraphrastic Similarity from Parallel Translations
John Wieting
Kevin Gimpel
Graham Neubig
Taylor Berg-Kirkpatrick
24
49
0
30 Sep 2019
Pivot-based Transfer Learning for Neural Machine Translation between Non-English Languages
Yunsu Kim
P. Petrov
Pavel Petrushkov
Shahram Khadivi
Hermann Ney
LRM
50
80
0
20 Sep 2019
From English to Code-Switching: Transfer Learning with Strong Morphological Clues
Gustavo Aguilar
Thamar Solorio
30
38
0
11 Sep 2019
WikiMatrix: Mining 135M Parallel Sentences in 1620 Language Pairs from Wikipedia
Holger Schwenk
Vishrav Chaudhary
Shuo Sun
Hongyu Gong
Francisco Guzmán
CVBM
24
400
0
10 Jul 2019
Low-Resource Corpus Filtering using Multilingual Sentence Embeddings
Vishrav Chaudhary
Y. Tang
Francisco Guzmán
Holger Schwenk
Philipp Koehn
29
77
0
20 Jun 2019
Improving Multilingual Sentence Embedding using Bi-directional Dual Encoder with Additive Margin Softmax
Yinfei Yang
Gustavo Hernández Ábrego
Steve Yuan
Mandy Guo
Qinlan Shen
Daniel Cer
Yun-hsuan Sung
B. Strope
R. Kurzweil
52
115
0
22 Feb 2019
Effective Parallel Corpus Mining using Bilingual Sentence Embeddings
Mandy Guo
Qinlan Shen
Yinfei Yang
Heming Ge
Daniel Cer
...
K. Stevens
Noah Constant
Yun-hsuan Sung
B. Strope
R. Kurzweil
47
111
0
31 Jul 2018
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,748
0
26 Sep 2016
Effective Approaches to Attention-based Neural Machine Translation
Thang Luong
Hieu H. Pham
Christopher D. Manning
218
7,926
0
17 Aug 2015