ResearchTrend.AI
WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models

13 December 2021
Benjamin Minixhofer, Fabian Paischer, Navid Rekabsaz
arXiv:2112.06598

Papers citing "WECHSEL: Effective initialization of subword embeddings for cross-lingual transfer of monolingual language models"

40 / 40 papers shown
Bielik v3 Small: Technical Report
  Krzysztof Ociepa, Łukasz Flis, Remigiusz Kinas, Krzysztof Wróbel, Adrian Gwoździej
  05 May 2025

Adapters for Altering LLM Vocabularies: What Languages Benefit the Most? [VLM]
  HyoJung Han, Akiko Eriguchi, Haoran Xu, Hieu T. Hoang, Marine Carpuat, Huda Khayrallah
  12 Oct 2024

Large Vocabulary Size Improves Large Language Models
  Sho Takase, Ryokan Ri, Shun Kiyono, Takuya Kato
  24 Jun 2024

Linear Alignment of Vision-language Models for Image Captioning [CLIP, VLM]
  Fabian Paischer, M. Hofmarcher, Sepp Hochreiter, Thomas Adler
  10 Jul 2023

Societal Biases in Retrieved Contents: Measurement Framework and Adversarial Mitigation for BERT Rankers
  Navid Rekabsaz, Simone Kopeinik, Markus Schedl
  28 Apr 2021

Switch Transformers: Scaling to Trillion Parameter Models with Simple and Efficient Sparsity [MoE]
  W. Fedus, Barret Zoph, Noam M. Shazeer
  11 Jan 2021

Few-Shot Question Answering by Pretraining Span Selection
  Ori Ram, Yuval Kirstain, Jonathan Berant, Amir Globerson, Omer Levy
  02 Jan 2021

How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models
  Phillip Rust, Jonas Pfeiffer, Ivan Vulić, Sebastian Ruder, Iryna Gurevych
  31 Dec 2020

As Good as New. How to Successfully Recycle English GPT-2 to Make Models for Other Languages
  Wietse de Vries, Malvina Nissim
  10 Dec 2020

mT5: A massively multilingual pre-trained text-to-text transformer
  Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel
  22 Oct 2020

German's Next Language Model
  Branden Chan, Stefan Schweter, Timo Möller
  21 Oct 2020

Pre-training via Paraphrasing [AIMat]
  M. Lewis, Marjan Ghazvininejad, Gargi Ghosh, Armen Aghajanyan, Sida I. Wang, Luke Zettlemoyer
  26 Jun 2020

Language Models are Few-Shot Learners [BDL]
  Tom B. Brown, Benjamin Mann, Nick Ryder, Melanie Subbiah, Jared Kaplan, ..., Christopher Berner, Sam McCandlish, Alec Radford, Ilya Sutskever, Dario Amodei
  28 May 2020

Are All Languages Created Equal in Multilingual BERT?
  Shijie Wu, Mark Dredze
  18 May 2020

Do Neural Ranking Models Intensify Gender Bias?
  Navid Rekabsaz, Markus Schedl
  01 May 2020

What the [MASK]? Making Sense of Language-Specific BERT Models
  Debora Nozza, Federico Bianchi, Dirk Hovy
  05 Mar 2020

AraBERT: Transformer-based Model for Arabic Language Understanding
  Wissam Antoun, Fady Baly, Hazem M. Hajj
  28 Feb 2020

From English To Foreign Languages: Transferring Pre-trained Language Models
  Ke M. Tran
  18 Feb 2020

CamemBERT: a Tasty French Language Model
  Louis Martin, Benjamin Muller, Pedro Ortiz Suarez, Yoann Dupont, Laurent Romary, Eric Villemonte de la Clergerie, Djamé Seddah, Benoît Sagot
  10 Nov 2019

Unsupervised Cross-lingual Representation Learning at Scale
  Alexis Conneau, Kartikay Khandelwal, Naman Goyal, Vishrav Chaudhary, Guillaume Wenzek, Francisco Guzmán, Edouard Grave, Myle Ott, Luke Zettlemoyer, Veselin Stoyanov
  05 Nov 2019

On the Cross-lingual Transferability of Monolingual Representations
  Mikel Artetxe, Sebastian Ruder, Dani Yogatama
  25 Oct 2019

RoBERTa: A Robustly Optimized BERT Pretraining Approach [AIMat]
  Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, M. Lewis, Luke Zettlemoyer, Veselin Stoyanov
  26 Jul 2019

SpanBERT: Improving Pre-training by Representing and Predicting Spans
  Mandar Joshi, Danqi Chen, Yinhan Liu, Daniel S. Weld, Luke Zettlemoyer, Omer Levy
  24 Jul 2019

XLNet: Generalized Autoregressive Pretraining for Language Understanding [AI4CE]
  Zhilin Yang, Zihang Dai, Yiming Yang, J. Carbonell, Ruslan Salakhutdinov, Quoc V. Le
  19 Jun 2019

Energy and Policy Considerations for Deep Learning in NLP
  Emma Strubell, Ananya Ganesh, Andrew McCallum
  05 Jun 2019

How multilingual is Multilingual BERT? [LRM, VLM]
  Telmo Pires, Eva Schlinger, Dan Garrette
  04 Jun 2019

Regularization Advantages of Multilingual Neural Language Models for Low Resource Domains
  Navid Rekabsaz, Nikolaos Pappas, James Henderson, B. K. Khonglah, S. Madikeri
  29 May 2019

BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding [VLM, SSL, SSeg]
  Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova
  11 Oct 2018

XNLI: Evaluating Cross-lingual Sentence Representations [ELM]
  Alexis Conneau, Guillaume Lample, Ruty Rinott, Adina Williams, Samuel R. Bowman, Holger Schwenk, Veselin Stoyanov
  13 Sep 2018

Adversarial Removal of Demographic Attributes from Text Data [FaML]
  Yanai Elazar, Yoav Goldberg
  20 Aug 2018

A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings [SSL]
  Mikel Artetxe, Gorka Labaka, Eneko Agirre
  16 May 2018

Loss in Translation: Learning Bilingual Word Mapping with a Retrieval Criterion
  Armand Joulin, Piotr Bojanowski, Tomas Mikolov, Hervé Jégou, Edouard Grave
  20 Apr 2018

Word Translation Without Parallel Data
  Alexis Conneau, Guillaume Lample, Marc'Aurelio Ranzato, Ludovic Denoyer, Hervé Jégou
  11 Oct 2017

Transfer Learning across Low-Resource, Related Languages for Neural Machine Translation
  Toan Q. Nguyen, David Chiang
  31 Aug 2017

Attention Is All You Need [3DV]
  Ashish Vaswani, Noam M. Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan Gomez, Lukasz Kaiser, Illia Polosukhin
  12 Jun 2017

Semantics derived automatically from language corpora contain human-like biases
  Aylin Caliskan, J. Bryson, Arvind Narayanan
  25 Aug 2016

Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings [CVBM, FaML]
  Tolga Bolukbasi, Kai-Wei Chang, James Zou, Venkatesh Saligrama, Adam Kalai
  21 Jul 2016

Enriching Word Vectors with Subword Information [NAI, SSL, VLM]
  Piotr Bojanowski, Edouard Grave, Armand Joulin, Tomas Mikolov
  15 Jul 2016

Learning Crosslingual Word Embeddings without Bilingual Corpora
  Long Duong, H. Kanayama, Tengfei Ma, Steven Bird, Trevor Cohn
  30 Jun 2016

Transfer Learning for Low-Resource Neural Machine Translation
  Barret Zoph, Deniz Yuret, Jonathan May, Kevin Knight
  08 Apr 2016