1907.05791
WikiMatrix: Mining 135M Parallel Sentences in 1620 Language Pairs from Wikipedia
10 July 2019
Holger Schwenk
Vishrav Chaudhary
Shuo Sun
Hongyu Gong
Francisco Guzmán
CVBM
Papers citing
"WikiMatrix: Mining 135M Parallel Sentences in 1620 Language Pairs from Wikipedia"
50 / 225 papers shown
Not All LoRA Parameters Are Essential: Insights on Inference Necessity
Guanhua Chen
Yutong Yao
Ci-Jun Gao
Lidia S. Chao
Feng Wan
Derek F. Wong
39
0
0
30 Mar 2025
A kinetic-based regularization method for data science applications
Abhisek Ganguly
Alessandro Gabbana
Vybhav Rao
Sauro Succi
Santosh Ansumali
57
0
0
06 Mar 2025
Solving Word-Sense Disambiguation and Word-Sense Induction with Dictionary Examples
Tadej Škvorc
Marko Robnik-Šikonja
58
0
0
06 Mar 2025
Few-Shot Multilingual Open-Domain QA from 5 Examples
Fan Jiang
Tom Drummond
Trevor Cohn
53
0
0
27 Feb 2025
NusaAksara: A Multimodal and Multilingual Benchmark for Preserving Indonesian Indigenous Scripts
Muhammad Farid Adilazuarda
M. Wijanarko
Lucky Susanto
Khumaisa Nur'aini
Derry Wijaya
Alham Fikri Aji
57
0
0
25 Feb 2025
AFRIDOC-MT: Document-level MT Corpus for African Languages
Jesujoba Oluwadara Alabi
Israel Abebe Azime
Miaoran Zhang
C. España-Bonet
Rachel Bawden
...
Shamsuddeen Hassan Muhammad
Neo Putini
David O. Ademuyiwa
Andrew Caines
Dietrich Klakow
39
0
0
10 Jan 2025
How far can bias go? -- Tracing bias from pretraining data to alignment
Marion Thaler
Abdullatif Köksal
Alina Leidinger
Anna Korhonen
Hinrich Schütze
74
0
0
28 Nov 2024
Responsible Multilingual Large Language Models: A Survey of Development, Applications, and Societal Impact
Junhua Liu
Bin Fu
LRM
37
1
0
23 Oct 2024
Ukrainian-to-English folktale corpus: Parallel corpus creation and augmentation for machine translation in low-resource languages
Olena Burda-Lassen
32
3
0
14 Oct 2024
State of NLP in Kenya: A Survey
Cynthia Jayne Amol
Everlyn Asiko Chimoto
Rose Delilah Gesicho
Antony M. Gitau
Naome A. Etori
...
Catherine Gitau
Antony Ndolo
Lilian D. A. Wanzare
Albert Njoroge Kahira
Ronald Tombe
34
1
0
13 Oct 2024
Adapters for Altering LLM Vocabularies: What Languages Benefit the Most?
HyoJung Han
Akiko Eriguchi
Haoran Xu
Hieu T. Hoang
Marine Carpuat
Huda Khayrallah
VLM
43
2
0
12 Oct 2024
Cross-lingual Human-Preference Alignment for Neural Machine Translation with Direct Quality Optimization
Kaden Uhlig
Joern Wuebker
Raphael Reinauer
John DeNero
43
0
0
26 Sep 2024
EMMA-500: Enhancing Massively Multilingual Adaptation of Large Language Models
Shaoxiong Ji
Zihao Li
Indraneil Paul
Jaakko Paavola
Peiqin Lin
...
Dayyán O'Brien
Hengyu Luo
Hinrich Schütze
Jörg Tiedemann
Barry Haddow
CLL
43
3
0
26 Sep 2024
Scaling Laws of Decoder-Only Models on the Multilingual Machine Translation Task
Gaëtan Caillaut
Raheel Qader
Mariam Nakhlé
Jingshu Liu
Jean-Gabriel Barthélemy
33
1
0
23 Sep 2024
Distilling Monolingual and Crosslingual Word-in-Context Representations
Yuki Arase
Tomoyuki Kajiwara
25
0
0
13 Sep 2024
Goldfish: Monolingual Language Models for 350 Languages
Tyler A. Chang
Catherine Arnett
Zhuowen Tu
Benjamin Bergen
LRM
51
4
0
19 Aug 2024
Improving Multilingual Neural Machine Translation by Utilizing Semantic and Linguistic Features
Mengyu Bu
Shuhao Gu
Yang Feng
36
3
0
02 Aug 2024
Generating Gender Alternatives in Machine Translation
Sarthak Garg
Mozhdeh Gheini
Clara Emmanuel
Tatiana Likhomanenko
Qin Gao
Matthias Paulik
41
3
0
29 Jul 2024
FFN: a Fine-grained Chinese-English Financial Domain Parallel Corpus
Yuxin Fu
Shijing Si
Leyi Mai
Xi-ang Li
47
1
0
27 Jun 2024
Latent Space Translation via Inverse Relative Projection
Valentino Maiorca
Luca Moschella
Marco Fumero
Francesco Locatello
Emanuele Rodolà
47
1
0
21 Jun 2024
Selected Languages are All You Need for Cross-lingual Truthfulness Transfer
Weihao Liu
Ning Wu
Wenbiao Ding
Shining Liang
Ming Gong
Dongmei Zhang
HILM
40
2
0
20 Jun 2024
ProxyLM: Predicting Language Model Performance on Multilingual Tasks via Proxy Models
David Anugraha
Genta Indra Winata
Chenyue Li
Patrick Amadeus Irawan
En-Shiun Annie Lee
41
7
0
13 Jun 2024
Recovering document annotations for sentence-level bitext
R. Wicks
Matt Post
Philipp Koehn
39
4
0
06 Jun 2024
Critical Learning Periods: Leveraging Early Training Dynamics for Efficient Data Pruning
E. Chimoto
Jay Gala
Orevaoghene Ahia
Julia Kreutzer
Bruce A. Bassett
Sara Hooker
VLM
46
4
0
29 May 2024
Dynamic data sampler for cross-language transfer learning in large language models
Yudong Li
Yuhao Feng
Wen Zhou
Zhe Zhao
Linlin Shen
Cheng-An Hou
Xianxu Hou
46
5
0
17 May 2024
A Japanese-Chinese Parallel Corpus Using Crowdsourcing for Web Mining
Masaaki Nagata
Makoto Morishita
Katsuki Chousa
Norihito Yasuda
29
2
0
15 May 2024
Relay Decoding: Concatenating Large Language Models for Machine Translation
Chengpeng Fu
Xiaocheng Feng
Yi-Chong Huang
Wenshuai Huo
Baohang Li
Hui Wang
Bing Qin
Ting Liu
34
0
0
05 May 2024
The Power of Question Translation Training in Multilingual Reasoning: Broadened Scope and Deepened Insights
Wenhao Zhu
Shujian Huang
Fei Yuan
Cheng Chen
Jiajun Chen
Alexandra Birch
LRM
52
5
0
02 May 2024
Modeling Orthographic Variation in Occitan's Dialects
Zachary Hopton
Noemi Aepli
35
2
0
30 Apr 2024
Simultaneous Interpretation Corpus Construction by Large Language Models in Distant Language Pair
Yusuke Sakai
Mana Makinae
Hidetaka Kamigaito
Taro Watanabe
40
4
0
18 Apr 2024
Charles Translator: A Machine Translation System between Ukrainian and Czech
Martin Popel
Lucie Poláková
Michal Novák
Jindřich Helcl
Jindřich Libovický
Pavel Straňák
Tomáš Krabač
Jaroslava Hlaváčová
Mariia Anisimova
Tereza Chlanová
27
0
0
10 Apr 2024
Multilingual Large Language Model: A Survey of Resources, Taxonomy and Frontiers
Libo Qin
Qiguang Chen
Yuhang Zhou
Zhi Chen
Hai-Tao Zheng
Lizi Liao
Min Li
Wanxiang Che
Philip S. Yu
LRM
57
36
0
07 Apr 2024
Teaching Llama a New Language Through Cross-Lingual Knowledge Transfer
Hele-Andra Kuulmets
Taido Purason
Agnes Luhtaru
Mark Fishel
29
17
0
05 Apr 2024
Transforming LLMs into Cross-modal and Cross-lingual Retrieval Systems
Frank Palma Gomez
Ramon Sanabria
Yun-hsuan Sung
Daniel Cer
Siddharth Dalmia
Gustavo Hernández Ábrego
VLM
41
4
0
02 Apr 2024
Going Beyond Word Matching: Syntax Improves In-context Example Selection for Machine Translation
Chenming Tang
Zhixiang Wang
Yunfang Wu
31
1
0
28 Mar 2024
Improving Vietnamese-English Medical Machine Translation
Nhu Vo
Dat Quoc Nguyen
Dung D. Le
Massimo Piccardi
Wray Buntine
LM&MA
40
0
0
28 Mar 2024
Semantically Enriched Cross-Lingual Sentence Embeddings for Crisis-related Social Media Texts
Rabindra Lamsal
M. Read
S. Karunasekera
42
1
0
25 Mar 2024
LLMs Are Few-Shot In-Context Low-Resource Language Learners
Samuel Cahyawijaya
Holy Lovenia
Pascale Fung
48
37
0
25 Mar 2024
Pointer-Generator Networks for Low-Resource Machine Translation: Don't Copy That!
Niyati Bafna
Philipp Koehn
David Yarowsky
40
1
0
16 Mar 2024
Tower: An Open Multilingual Large Language Model for Translation-Related Tasks
Duarte M. Alves
José P. Pombal
Nuno M. Guerreiro
Pedro H. Martins
João Alves
...
Patrick Fernandes
Sweta Agrawal
Pierre Colombo
José G. C. de Souza
André F.T. Martins
LRM
57
132
0
27 Feb 2024
GATE X-E : A Challenge Set for Gender-Fair Translations from Weakly-Gendered Languages
Spencer Rarrick
Ranjita Naik
Sundar Poudel
Vishal Chowdhary
39
1
0
22 Feb 2024
Pixel Sentence Representation Learning
Chenghao Xiao
Zhuoxu Huang
Danlu Chen
G. Hudson
Yizhi Li
Haoran Duan
Chenghua Lin
Jie Fu
Jungong Han
Noura Al Moubayed
SSL
17
2
0
13 Feb 2024
Quality Does Matter: A Detailed Look at the Quality and Utility of Web-Mined Parallel Corpora
Surangika Ranathunga
Nisansa de Silva
Menan Velayuthan
Aloka Fernando
Charitha Rathnayake
39
12
0
12 Feb 2024
Improving Machine Translation with Human Feedback: An Exploration of Quality Estimation as a Reward Model
Zhiwei He
Xing Wang
Wenxiang Jiao
ZhuoSheng Zhang
Rui Wang
Shuming Shi
Zhaopeng Tu
ALM
37
24
0
23 Jan 2024
PersianMind: A Cross-Lingual Persian-English Large Language Model
Pedram Rostami
Ali Salemi
M. Dousti
CLL
LRM
37
5
0
12 Jan 2024
POMP: Probability-driven Meta-graph Prompter for LLMs in Low-resource Unsupervised Neural Machine Translation
Shilong Pan
Zhiliang Tian
Liang Ding
Zhen Huang
Zhihua Wen
Dongsheng Li
37
2
0
11 Jan 2024
Bridging Background Knowledge Gaps in Translation with Automatic Explicitation
HyoJung Han
Jordan L. Boyd-Graber
Marine Carpuat
74
5
0
03 Dec 2023
YUAN 2.0: A Large Language Model with Localized Filtering-based Attention
Shaohua Wu
Xudong Zhao
Shenling Wang
Jiangang Luo
Lingjun Li
...
Wei Wang
Tong Yu
Rongguo Zhang
Jiahua Zhang
Chao Wang
OSLM
56
6
0
27 Nov 2023
The Ups and Downs of Large Language Model Inference with Vocabulary Trimming by Language Heuristics
Nikolay Bogoychev
Pinzhen Chen
Barry Haddow
Alexandra Birch
33
0
0
16 Nov 2023
How Vocabulary Sharing Facilitates Multilingualism in LLaMA?
Fei Yuan
Shuai Yuan
Zhiyong Wu
Lei Li
42
10
0
15 Nov 2023