ResearchTrend.AI

You Can Have Your Data and Balance It Too: Towards Balanced and Efficient Multilingual Models

13 October 2022
Tomasz Limisiewicz
Daniel Malkin
Gabriel Stanovsky

Papers citing "You Can Have Your Data and Balance It Too: Towards Balanced and Efficient Multilingual Models"

8 / 8 papers shown
Title
Optimal word order for non-causal text generation with Large Language Models: the Spanish case
Andrea Busto-Castiñeira
Silvia García-Méndez
Francisco de Arriba-Pérez
Francisco J. González Castaño
41
0
0
21 Feb 2025
MAGNET: Improving the Multilingual Fairness of Language Models with Adaptive Gradient-Based Tokenization
Orevaoghene Ahia
Sachin Kumar
Hila Gonen
Valentin Hofmann
Tomasz Limisiewicz
Yulia Tsvetkov
Noah A. Smith
51
4
0
11 Jul 2024
A Representative Study on Human Detection of Artificially Generated Media Across Countries
Joel Frank
Franziska Herbert
Jonas Ricker
Lea Schönherr
Thorsten Eisenhofer
Asja Fischer
Markus Dürmuth
Thorsten Holz
38
13
0
10 Dec 2023
PuoBERTa: Training and evaluation of a curated language model for Setswana
Vukosi Marivate
Moseli Motsóehli
Valencia Wagner
Richard Lastrucci
Isheanesu Dzingirai
27
8
0
13 Oct 2023
Probing Classifiers: Promises, Shortcomings, and Advances
Yonatan Belinkov
226
405
0
24 Feb 2021
How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models
Phillip Rust
Jonas Pfeiffer
Ivan Vulić
Sebastian Ruder
Iryna Gurevych
80
235
0
31 Dec 2020
When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models
Benjamin Muller
Antonis Anastasopoulos
Benoît Sagot
Djamé Seddah
LRM
134
165
0
24 Oct 2020
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation
Yonghui Wu
M. Schuster
Z. Chen
Quoc V. Le
Mohammad Norouzi
...
Alex Rudnick
Oriol Vinyals
G. Corrado
Macduff Hughes
J. Dean
AIMat
716
6,746
0
26 Sep 2016