2210.07135
Cited By
You Can Have Your Data and Balance It Too: Towards Balanced and Efficient Multilingual Models
13 October 2022
Tomasz Limisiewicz
Daniel Malkin
Gabriel Stanovsky
Papers citing "You Can Have Your Data and Balance It Too: Towards Balanced and Efficient Multilingual Models"
8 / 8 papers shown
Title
Optimal word order for non-causal text generation with Large Language Models: the Spanish case
Andrea Busto-Castiñeira
Silvia García-Méndez
Francisco de Arriba-Pérez
Francisco J. González Castaño
41
0
0
21 Feb 2025
MAGNET: Improving the Multilingual Fairness of Language Models with Adaptive Gradient-Based Tokenization
Orevaoghene Ahia
Sachin Kumar
Hila Gonen
Valentin Hofmann
Tomasz Limisiewicz
Yulia Tsvetkov
Noah A. Smith
51
4
0
11 Jul 2024
A Representative Study on Human Detection of Artificially Generated Media Across Countries
Joel Frank
Franziska Herbert
Jonas Ricker
Lea Schönherr
Thorsten Eisenhofer
Asja Fischer
Markus Dürmuth
Thorsten Holz
38
13
0
10 Dec 2023
PuoBERTa: Training and evaluation of a curated language model for Setswana
Vukosi Marivate
Moseli Motsóehli
Valencia Wagner
Richard Lastrucci
Isheanesu Dzingirai
30
8
0
13 Oct 2023
Probing Classifiers: Promises, Shortcomings, and Advances
Yonatan Belinkov
226
405
0
24 Feb 2021
How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models
Phillip Rust
Jonas Pfeiffer
Ivan Vulić
Sebastian Ruder
Iryna Gurevych
80
235
0
31 Dec 2020
When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models
Benjamin Muller
Antonis Anastasopoulos
Benoît Sagot
Djamé Seddah
LRM
134
165
0
24 Oct 2020
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,746
0
26 Sep 2016