Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2301.10472
Cited By
XLM-V: Overcoming the Vocabulary Bottleneck in Multilingual Masked Language Models
25 January 2023
Davis Liang
Hila Gonen
Yuning Mao
Rui Hou
Naman Goyal
Marjan Ghazvininejad
Luke Zettlemoyer
Madian Khabsa
Re-assign community
ArXiv
PDF
HTML
Papers citing
"XLM-V: Overcoming the Vocabulary Bottleneck in Multilingual Masked Language Models"
17 / 17 papers shown
Title
Crosslingual Reasoning through Test-Time Scaling
Zheng-Xin Yong
Muhammad Farid Adilazuarda
Jonibek Mansurov
Ruochen Zhang
Niklas Muennighoff
Carsten Eickhoff
Genta Indra Winata
Julia Kreutzer
Stephen H. Bach
Alham Fikri Aji
LRM
ELM
157
0
0
08 May 2025
HYPEROFA: Expanding LLM Vocabulary to New Languages via Hypernetwork-Based Embedding Initialization
Enes Özeren
Yihong Liu
Hinrich Schütze
31
0
0
21 Apr 2025
Learning on LLM Output Signatures for gray-box LLM Behavior Analysis
Guy Bar-Shalom
Fabrizio Frasca
Derek Lim
Yoav Gelberg
Yftah Ziser
Ran El-Yaniv
Gal Chechik
Haggai Maron
67
0
0
18 Mar 2025
PixelWorld: Towards Perceiving Everything as Pixels
Zhiheng Lyu
Xueguang Ma
Wenhu Chen
145
0
0
31 Jan 2025
Pixology: Probing the Linguistic and Visual Capabilities of Pixel-based Language Models
Kushal Tatariya
Vladimir Araujo
Thomas Bauwens
Miryam de Lhoneux
VLM
35
0
0
15 Oct 2024
Layer Swapping for Zero-Shot Cross-Lingual Transfer in Large Language Models
Lucas Bandarkar
Benjamin Muller
Pritish Yuvraj
Rui Hou
Nayan Singhal
Hongjiang Lv
Bing-Quan Liu
KELM
LRM
MoMe
52
3
0
02 Oct 2024
UniBridge: A Unified Approach to Cross-Lingual Transfer Learning for Low-Resource Languages
Trinh Pham
Khoi M. Le
Luu Anh Tuan
42
1
0
14 Jun 2024
How Vocabulary Sharing Facilitates Multilingualism in LLaMA?
Fei Yuan
Shuai Yuan
Zhiyong Wu
Lei Li
37
10
0
15 Nov 2023
Analyzing Cognitive Plausibility of Subword Tokenization
Lisa Beinborn
Yuval Pinter
29
17
0
20 Oct 2023
An Efficient Multilingual Language Model Compression through Vocabulary Trimming
Asahi Ushio
Yi Zhou
Jose Camacho-Collados
41
7
0
24 May 2023
Small Models are Valuable Plug-ins for Large Language Models
Canwen Xu
Yichong Xu
Shuohang Wang
Yang Liu
Chenguang Zhu
Julian McAuley
LLMAG
41
45
0
15 May 2023
Oolong: Investigating What Makes Transfer Learning Hard with Controlled Studies
Zhengxuan Wu
Alex Tamkin
Isabel Papadimitriou
21
10
0
24 Feb 2022
Larger-Scale Transformers for Multilingual Masked Language Modeling
Naman Goyal
Jingfei Du
Myle Ott
Giridhar Anantharaman
Alexis Conneau
90
98
0
02 May 2021
AmericasNLI: Evaluating Zero-shot Natural Language Understanding of Pretrained Multilingual Models in Truly Low-resource Languages
Abteen Ebrahimi
Manuel Mager
Arturo Oncevay
Vishrav Chaudhary
Luis Chiruzzo
...
Graham Neubig
Alexis Palmer
Rolando A. Coto Solano
Ngoc Thang Vu
Katharina Kann
109
72
0
18 Apr 2021
How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models
Phillip Rust
Jonas Pfeiffer
Ivan Vulić
Sebastian Ruder
Iryna Gurevych
80
235
0
31 Dec 2020
Improving Multilingual Models with Language-Clustered Vocabularies
Hyung Won Chung
Dan Garrette
Kiat Chuan Tan
Jason Riesa
VLM
77
65
0
24 Oct 2020
MLQA: Evaluating Cross-lingual Extractive Question Answering
Patrick Lewis
Barlas Oğuz
Ruty Rinott
Sebastian Riedel
Holger Schwenk
ELM
246
492
0
16 Oct 2019
1