arXiv: 2205.06266
Lifting the Curse of Multilinguality by Pre-training Modular Transformers
12 May 2022
Jonas Pfeiffer, Naman Goyal, Xi Lin, Xian Li, James Cross, Sebastian Riedel, Mikel Artetxe
[LRM]
Papers citing "Lifting the Curse of Multilinguality by Pre-training Modular Transformers" (50 of 58 papers shown)
Efficiently Democratizing Medical LLMs for 50 Languages via a Mixture of Language Family Experts (14 Oct 2024). Guorui Zheng, Xidong Wang, Juhao Liang, Nuo Chen, Yuping Zheng, Benyou Wang. [MoE]
Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models (02 Oct 2024). Lucas Bandarkar, Benjamin Muller, Pritish Yuvraj, Rui Hou, Nayan Singhal, Hongjiang Lv, Bing-Quan Liu. [KELM, LRM, MoMe]
LangSAMP: Language-Script Aware Multilingual Pretraining (26 Sep 2024). Yihong Liu, Haotian Ye, Chunlan Ma, Mingyang Wang, Hinrich Schütze. [VLM]
Bridging the Language Gap: Enhancing Multilingual Prompt-Based Code Generation in LLMs via Zero-Shot Cross-Lingual Transfer (19 Aug 2024). Mingda Li, Abhijit Mishra, Utkarsh Mujumdar.
Multilingual Unsupervised Neural Machine Translation with Denoising Adapters (20 Oct 2021). Ahmet Üstün, Alexandre Berard, Laurent Besacier, Matthias Gallé.
Multilingual Domain Adaptation for NMT: Decoupling Language and Domain Information with Adapters (18 Oct 2021). Asa Cooper Stickland, Alexandre Berard, Vassilina Nikoulina. [AI4CE]
Composable Sparse Fine-Tuning for Cross-Lingual Transfer (14 Oct 2021). Alan Ansell, Edoardo Ponti, Anna Korhonen, Ivan Vulić. [CLL, MoE]
Efficient Test Time Adapter Ensembling for Low-resource Language Varieties (10 Sep 2021). Xinyi Wang, Yulia Tsvetkov, Sebastian Ruder, Graham Neubig.
Subword Mapping and Anchoring across Languages (09 Sep 2021). Giorgos Vernikos, Andrei Popescu-Belis.
DEMix Layers: Disentangling Domains for Modular Language Modeling (11 Aug 2021). Suchin Gururangan, Michael Lewis, Ari Holtzman, Noah A. Smith, Luke Zettlemoyer. [KELM, MoE]
Charformer: Fast Character Transformers via Gradient-based Subword Tokenization (23 Jun 2021). Yi Tay, Vinh Q. Tran, Sebastian Ruder, Jai Gupta, Hyung Won Chung, Dara Bahri, Zhen Qin, Simon Baumgartner, Cong Yu, Donald Metzler.
Compacter: Efficient Low-Rank Hypercomplex Adapter Layers (08 Jun 2021). Rabeeh Karimi Mahabadi, James Henderson, Sebastian Ruder. [MoE]
Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks (08 Jun 2021). Rabeeh Karimi Mahabadi, Sebastian Ruder, Mostafa Dehghani, James Henderson. [MoE]
Lightweight Adapter Tuning for Multilingual Speech Translation (02 Jun 2021). Hang Le, J. Pino, Changhan Wang, Jiatao Gu, D. Schwab, Laurent Besacier.
ByT5: Towards a token-free future with pre-trained byte-to-byte models (28 May 2021). Linting Xue, Aditya Barua, Noah Constant, Rami Al-Rfou, Sharan Narang, Mihir Kale, Adam Roberts, Colin Raffel.
AmericasNLI: Evaluating Zero-shot Natural Language Understanding of Pretrained Multilingual Models in Truly Low-resource Languages (18 Apr 2021). Abteen Ebrahimi, Manuel Mager, Arturo Oncevay, Vishrav Chaudhary, Luis Chiruzzo, ..., Graham Neubig, Alexis Palmer, Rolando A. Coto Solano, Ngoc Thang Vu, Katharina Kann.
What to Pre-Train on? Efficient Intermediate Task Selection (16 Apr 2021). Clifton A. Poth, Jonas Pfeiffer, Andreas Rucklé, Iryna Gurevych.
CANINE: Pre-training an Efficient Tokenization-Free Encoder for Language Representation (11 Mar 2021). J. Clark, Dan Garrette, Iulia Turc, John Wieting.
Towards Continual Learning for Multilingual Machine Translation via Vocabulary Substitution (11 Mar 2021). Xavier Garcia, Noah Constant, Ankur P. Parikh, Orhan Firat.
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity (11 Jan 2021). W. Fedus, Barret Zoph, Noam M. Shazeer. [MoE]
How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models (31 Dec 2020). Phillip Rust, Jonas Pfeiffer, Ivan Vulić, Sebastian Ruder, Iryna Gurevych.
UNKs Everywhere: Adapting Multilingual Language Models to New Scripts (31 Dec 2020). Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych, Sebastian Ruder.
Orthogonal Language and Task Adapters in Zero-Shot Cross-Lingual Transfer (11 Dec 2020). M. Vidoni, Ivan Vulić, Goran Glavaš.
When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models (24 Oct 2020). Benjamin Muller, Antonis Anastasopoulos, Benoît Sagot, Djamé Seddah. [LRM]
Rethinking embedding coupling in pre-trained language models (24 Oct 2020). Hyung Won Chung, Thibault Févry, Henry Tsai, Melvin Johnson, Sebastian Ruder.
Improving Multilingual Models with Language-Clustered Vocabularies (24 Oct 2020). Hyung Won Chung, Dan Garrette, Kiat Chuan Tan, Jason Riesa. [VLM]
AdapterDrop: On the Efficiency of Adapters in Transformers (22 Oct 2020). Andreas Rucklé, Gregor Geigle, Max Glockner, Tilman Beck, Jonas Pfeiffer, Nils Reimers, Iryna Gurevych.
MultiCQA: Zero-Shot Transfer of Self-Supervised Text Matching Models on a Massive Scale (02 Oct 2020). Andreas Rucklé, Jonas Pfeiffer, Iryna Gurevych.
Parsing with Multilingual BERT, a Small Corpus, and a Small Treebank (29 Sep 2020). Ethan C. Chau, Lucy H. Lin, Noah A. Smith.
Reusing a Pretrained Language Model on Languages with Limited Corpora for Unsupervised NMT (16 Sep 2020). Alexandra Chronopoulou, Dario Stojanovski, Alexander Fraser.
Common Sense or World Knowledge? Investigating Adapter-Based Knowledge Injection into Pretrained Transformers (24 May 2020). Anne Lauscher, Olga Majewska, Leonardo F. R. Ribeiro, Iryna Gurevych, Nikolai Rozanov, Goran Glavaš. [KELM]
Are All Languages Created Equal in Multilingual BERT? (18 May 2020). Shijie Wu, Mark Dredze.
XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning (01 May 2020). Edoardo Ponti, Goran Glavaš, Olga Majewska, Qianchu Liu, Ivan Vulić, Anna Korhonen. [LRM]
AdapterFusion: Non-Destructive Task Composition for Transfer Learning (01 May 2020). Jonas Pfeiffer, Aishwarya Kamath, Andreas Rucklé, Kyunghyun Cho, Iryna Gurevych. [CLL, MoMe]
MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer (30 Apr 2020). Jonas Pfeiffer, Ivan Vulić, Iryna Gurevych, Sebastian Ruder.
UDapter: Language Adaptation for Truly Universal Dependency Parsing (29 Apr 2020). Ahmet Üstün, Arianna Bisazza, G. Bouma, Gertjan van Noord.
Extending Multilingual BERT to Low-Resource Languages (28 Apr 2020). Zihan Wang, Karthikeyan K, Stephen D. Mayhew, Dan Roth. [VLM]
XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization (24 Mar 2020). Junjie Hu, Sebastian Ruder, Aditya Siddhant, Graham Neubig, Orhan Firat, Melvin Johnson. [ELM]
From English To Foreign Languages: Transferring Pre-trained Language Models (18 Feb 2020). Ke M. Tran.
K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters (05 Feb 2020). Ruize Wang, Duyu Tang, Nan Duan, Zhongyu Wei, Xuanjing Huang, Jianshu Ji, Guihong Cao, Daxin Jiang, Ming Zhou. [KELM]
Cross-Lingual Ability of Multilingual BERT: An Empirical Study (17 Dec 2019). Karthikeyan K, Zihan Wang, Stephen D. Mayhew, Dan Roth. [LRM]
Unsupervised Cross-lingual Representation Learning at Scale (05 Nov 2019). Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, Veselin Stoyanov.
On the Cross-lingual Transferability of Monolingual Representations (25 Oct 2019). Mikel Artetxe, Sebastian Ruder, Dani Yogatama.
Simple, Scalable Adaptation for Neural Machine Translation (18 Sep 2019). Ankur Bapna, N. Arivazhagan, Orhan Firat. [AI4CE]
Slice-based Learning: A Programming Model for Residual Learning in Critical Data Slices (13 Sep 2019). V. Chen, Sen Wu, Zhenzhen Weng, Alexander Ratner, Christopher Ré.
How multilingual is Multilingual BERT? (04 Jun 2019). Telmo Pires, Eva Schlinger, Dan Garrette. [LRM, VLM]
Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT (19 Apr 2019). Shijie Wu, Mark Dredze. [VLM, SSeg]
fairseq: A Fast, Extensible Toolkit for Sequence Modeling (01 Apr 2019). Myle Ott, Sergey Edunov, Alexei Baevski, Angela Fan, Sam Gross, Nathan Ng, David Grangier, Michael Auli. [VLM, FaML]
Parameter-Efficient Transfer Learning for NLP (02 Feb 2019). N. Houlsby, A. Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, Sylvain Gelly.
How to (Properly) Evaluate Cross-Lingual Word Embeddings: On Strong Baselines, Comparative Analyses, and Some Misconceptions (01 Feb 2019). Goran Glavaš, Robert Litschko, Sebastian Ruder, Ivan Vulić. [ELM]