UniMax: Fairer and more Effective Language Sampling for Large-Scale Multilingual Pretraining

18 April 2023
Hyung Won Chung
Noah Constant
Xavier Garcia
Adam Roberts
Yi Tay
Sharan Narang
Orhan Firat

Papers citing "UniMax: Fairer and more Effective Language Sampling for Large-Scale Multilingual Pretraining"

48 / 48 papers shown
IDEAL: Data Equilibrium Adaptation for Multi-Capability Language Model Alignment
Chenlin Ming
Chendi Qu
Mengzhang Cai
Qizhi Pei
Zhuoshi Pan
Yu-Hu Li
Xiaoming Duan
Lijun Wu
Conghui He
5
0
0
19 May 2025
Uni3C: Unifying Precisely 3D-Enhanced Camera and Human Motion Controls for Video Generation
Chenjie Cao
Jingkai Zhou
Shikai Li
Jingyun Liang
Chaohui Yu
Fan Wang
Xiangyang Xue
Yanwei Fu
DiffM
VGen
68
0
0
21 Apr 2025
Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models
Xinlin Zhuang
Jiahui Peng
Ren Ma
Y. Wang
Tianyi Bai
Xingjian Wei
Jiantao Qiu
Chi Zhang
Ying Qian
Conghui He
53
0
0
19 Apr 2025
SkyReels-V2: Infinite-length Film Generative Model
Guibin Chen
D. Lin
Jiangping Yang
Chunze Lin
J. Zhu
...
Di Qiu
Debang Li
Zhengcong Fei
Yang Li
Yahui Zhou
DiffM
VGen
56
1
0
17 Apr 2025
Llama-3-Nanda-10B-Chat: An Open Generative Large Language Model for Hindi
Monojit Choudhury
Shivam Chauhan
Rocktim Jyoti Das
Dhruv Sahnan
Xudong Han
...
Rituraj Joshi
Gurpreet Gosal
Avraham Sheinin
Natalia Vassilieva
Preslav Nakov
33
0
0
08 Apr 2025
Won: Establishing Best Practices for Korean Financial NLP
Guijin Son
Hyunwoo Ko
Haneral Jung
Chami Hwang
49
0
0
23 Mar 2025
You Only Debias Once: Towards Flexible Accuracy-Fairness Trade-offs at Inference Time
Xiaotian Han
Tianlong Chen
Kaixiong Zhou
Zhimeng Jiang
Zhangyang Wang
Xia Hu
168
0
0
10 Mar 2025
Llama-3.1-Sherkala-8B-Chat: An Open Large Language Model for Kazakh
Fajri Koto
Rituraj Joshi
Nurdaulet Mukhituly
Yunhong Wang
Zhuohan Xie
...
Avraham Sheinin
Natalia Vassilieva
Neha Sengupta
Larry Murray
Preslav Nakov
ALM
KELM
43
0
0
03 Mar 2025
YAD: Leveraging T5 for Improved Automatic Diacritization of Yorùbá Text
Akindele Michael Olawole
Jesujoba Oluwadara Alabi
Aderonke Busayo Sakpere
David Ifeoluwa Adelani
26
0
0
31 Dec 2024
QUENCH: Measuring the gap between Indic and Non-Indic Contextual General Reasoning in LLMs
Mohammad Aflah Khan
Neemesh Yadav
Sarah Masud
Md. Shad Akhtar
74
0
0
16 Dec 2024
Deploying Multi-task Online Server with Large Language Model
Yincen Qu
Chao Ma
Xiangying Dai
Hui Zhou
Yiting Wu
Hengyue Liu
28
0
0
06 Nov 2024
Responsible Multilingual Large Language Models: A Survey of Development, Applications, and Societal Impact
Junhua Liu
Bin Fu
LRM
31
1
0
23 Oct 2024
Upsample or Upweight? Balanced Training on Heavily Imbalanced Datasets
Tianjian Li
Haoran Xu
Weiting Tan
Kenton Murray
Daniel Khashabi
35
1
0
06 Oct 2024
Enhancing and Accelerating Large Language Models via Instruction-Aware Contextual Compression
Haowen Hou
Fei Ma
Binwen Bai
Xinxin Zhu
Fei Yu
33
0
0
28 Aug 2024
A Review of the Challenges with Massive Web-mined Corpora Used in Large Language Models Pre-Training
Michał Perełkiewicz
Rafał Poświata
45
1
0
10 Jul 2024
Data, Data Everywhere: A Guide for Pretraining Dataset Construction
Jupinder Parmar
Shrimai Prabhumoye
Joseph Jennings
Bo Liu
Aastha Jhunjhunwala
Zhilin Wang
M. Patwary
M. Shoeybi
Bryan Catanzaro
50
5
0
08 Jul 2024
YourMT3+: Multi-instrument Music Transcription with Enhanced Transformer Architectures and Cross-dataset Stem Augmentation
Sungkyun Chang
Emmanouil Benetos
Holger Kirchhoff
Simon Dixon
37
2
0
05 Jul 2024
Uni-Mol2: Exploring Molecular Pretraining Model at Scale
Xiaohong Ji
Zhen Wang
Zhifeng Gao
Hang Zheng
Linfeng Zhang
Guolin Ke
Weinan E
AI4CE
48
6
0
21 Jun 2024
Does Diffusion Beat GAN in Image Super Resolution?
Denis Kuznedelev
Valerii Startsev
Daniil Shlenskii
Sergey Kastryulin
44
4
0
27 May 2024
SambaLingo: Teaching Large Language Models New Languages
Zoltan Csaki
Bo Li
Jonathan Li
Qiantong Xu
Pian Pawakapan
Leon Zhang
Yun Du
Hengyu Zhao
Changran Hu
Urmish Thakker
37
6
0
08 Apr 2024
Transcribing Bengali Text with Regional Dialects to IPA using District Guided Tokens
Jishanul Islam
Sadia Ahmmed
Sahid Hossain
11
0
0
26 Mar 2024
MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling
Tomasz Limisiewicz
Terra Blevins
Hila Gonen
Orevaoghene Ahia
Luke Zettlemoyer
30
13
0
15 Mar 2024
IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code Generators
Indraneil Paul
Goran Glavas
Iryna Gurevych
40
13
0
06 Mar 2024
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model
Ahmet Üstün
Viraat Aryabumi
Zheng-Xin Yong
Wei-Yin Ko
Daniel D'souza
...
Shayne Longpre
Niklas Muennighoff
Marzieh Fadaee
Julia Kreutzer
Sara Hooker
ALM
ELM
SyDa
LRM
35
194
0
12 Feb 2024
OLMo: Accelerating the Science of Language Models
Dirk Groeneveld
Iz Beltagy
Pete Walsh
Akshita Bhagia
Rodney Michael Kinney
...
Jesse Dodge
Kyle Lo
Luca Soldaini
Noah A. Smith
Hanna Hajishirzi
OSLM
141
358
0
01 Feb 2024
OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models
Fuzhao Xue
Zian Zheng
Yao Fu
Jinjie Ni
Zangwei Zheng
Wangchunshu Zhou
Yang You
MoE
30
87
0
29 Jan 2024
LangBridge: Multilingual Reasoning Without Multilingual Supervision
Dongkeun Yoon
Joel Jang
Sungdong Kim
Seungone Kim
Sheikh Shafayat
Minjoon Seo
LRM
24
14
0
19 Jan 2024
Breaking the Curse of Multilinguality with Cross-lingual Expert Language Models
Terra Blevins
Tomasz Limisiewicz
Suchin Gururangan
Margaret Li
Hila Gonen
Noah A. Smith
Luke Zettlemoyer
50
22
0
19 Jan 2024
MERA: A Comprehensive LLM Evaluation in Russian
Alena Fenogenova
Artem Chervyakov
Nikita Martynov
Anastasia Kozlova
Maria Tikhonova
...
Nikita Savushkin
Polina Mikhailova
Denis Dimitrov
Alexander Panchenko
Sergey Markov
ELM
39
10
0
09 Jan 2024
PersianLLaMA: Towards Building First Persian Large Language Model
Mohammad Amin Abbasi
A. Ghafouri
Mahdi Firouzmandi
Hassan Naderi
B. Minaei-Bidgoli
27
9
0
25 Dec 2023
Paloma: A Benchmark for Evaluating Language Model Fit
Ian H. Magnusson
Akshita Bhagia
Valentin Hofmann
Luca Soldaini
A. Jha
...
Iz Beltagy
Hanna Hajishirzi
Noah A. Smith
Kyle Richardson
Jesse Dodge
132
21
0
16 Dec 2023
Pipeline and Dataset Generation for Automated Fact-checking in Almost Any Language
Jan Drchal
Herbert Ullrich
Tomás Mlynár
Václav Moravec
HILM
21
1
0
15 Dec 2023
Order Matters in the Presence of Dataset Imbalance for Multilingual Learning
Dami Choi
Derrick Xin
Hamid Dadkhahi
Justin Gilmer
Ankush Garg
Orhan Firat
Chih-Kuan Yeh
Andrew M. Dai
Behrooz Ghorbani
55
3
0
11 Dec 2023
GreekT5: A Series of Greek Sequence-to-Sequence Models for News Summarization
Nikolaos Giarelis
Charalampos Mastrokostas
N. Karacapilidis
29
2
0
13 Nov 2023
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
Sneha Kudugunta
Isaac Caswell
Biao Zhang
Xavier Garcia
Christopher A. Choquette-Choo
...
Derrick Xin
Aditya Kusupati
Romi Stella
Ankur Bapna
Orhan Firat
67
118
0
09 Sep 2023
HAE-RAE Bench: Evaluation of Korean Knowledge in Language Models
Guijin Son
Hanwool Albert Lee
Suwan Kim
Huiseo Kim
Jaecheol Lee
Je Won Yeom
Jihyu Jung
Jung Woo Kim
Songseong Kim
RALM
ELM
26
20
0
06 Sep 2023
LoRA-FA: Memory-efficient Low-rank Adaptation for Large Language Models Fine-tuning
Longteng Zhang
Lin Zhang
S. Shi
X. Chu
Bo-wen Li
AI4CE
18
91
0
07 Aug 2023
A Pretrainer's Guide to Training Data: Measuring the Effects of Data Age, Domain Coverage, Quality, & Toxicity
Shayne Longpre
Gregory Yauney
Emily Reif
Katherine Lee
Adam Roberts
...
Denny Zhou
Jason W. Wei
Kevin Robinson
David M. Mimno
Daphne Ippolito
23
148
0
22 May 2023
mLongT5: A Multilingual and Efficient Text-To-Text Transformer for Longer Sequences
David C. Uthus
Santiago Ontañón
Joshua Ainslie
Mandy Guo
VLM
28
10
0
18 May 2023
Measuring Cross-Lingual Transferability of Multilingual Transformers on Sentence Classification
Zewen Chi
Heyan Huang
Xian-Ling Mao
70
0
0
15 May 2023
Measuring The Impact Of Programming Language Distribution
Gabriel Orlanski
Kefan Xiao
Xavier Garcia
Jeffrey Hui
Joshua Howland
J. Malmaud
Jacob Austin
Rishabh Singh
Michele Catasta
30
28
0
03 Feb 2023
Systematic Inequalities in Language Technology Performance across the World's Languages
Damián E. Blasi
Antonios Anastasopoulos
Graham Neubig
127
131
0
13 Oct 2021
Deduplicating Training Data Makes Language Models Better
Katherine Lee
Daphne Ippolito
A. Nystrom
Chiyuan Zhang
Douglas Eck
Chris Callison-Burch
Nicholas Carlini
SyDa
242
593
0
14 Jul 2021
Larger-Scale Transformers for Multilingual Masked Language Modeling
Naman Goyal
Jingfei Du
Myle Ott
Giridhar Anantharaman
Alexis Conneau
90
98
0
02 May 2021
Extracting Training Data from Large Language Models
Nicholas Carlini
Florian Tramèr
Eric Wallace
Matthew Jagielski
Ariel Herbert-Voss
...
Tom B. Brown
D. Song
Ulfar Erlingsson
Alina Oprea
Colin Raffel
MLAU
SILM
290
1,815
0
14 Dec 2020
Improving Multilingual Models with Language-Clustered Vocabularies
Hyung Won Chung
Dan Garrette
Kiat Chuan Tan
Jason Riesa
VLM
77
65
0
24 Oct 2020
MLQA: Evaluating Cross-lingual Extractive Question Answering
Patrick Lewis
Barlas Oğuz
Ruty Rinott
Sebastian Riedel
Holger Schwenk
ELM
246
492
0
16 Oct 2019
Language Models as Knowledge Bases?
Fabio Petroni
Tim Rocktäschel
Patrick Lewis
A. Bakhtin
Yuxiang Wu
Alexander H. Miller
Sebastian Riedel
KELM
AI4MH
417
2,588
0
03 Sep 2019