2505.12300
Cited By
HBO: Hierarchical Balancing Optimization for Fine-Tuning Large Language Models
18 May 2025
Weixuan Wang
Minghao Wu
Barry Haddow
Alexandra Birch
Papers citing
"HBO: Hierarchical Balancing Optimization for Fine-Tuning Large Language Models"
46 / 46 papers shown
Title
Gemma 3 Technical Report
Gemma Team
Aishwarya B Kamath
Johan Ferret
Shreya Pathak
Nino Vieillard
...
Harshal Tushar Lehri
Hussein Hazimeh
Ian Ballantyne
Idan Szpektor
Ivan Nardini
VLM
187
136
0
25 Mar 2025
EuroLLM: Multilingual Language Models for Europe
Pedro Henrique Martins
Patrick Fernandes
Joao Alves
Nuno M. Guerreiro
Ricardo Rei
...
Pierre Colombo
Barry Haddow
José G. C. de Souza
Alexandra Birch
André F. T. Martins
80
39
0
24 Sep 2024
Dynamic Data Mixing Maximizes Instruction Tuning for Mixture-of-Experts
Tong Zhu
Daize Dong
Xiaoye Qu
Jiacheng Ruan
Wenliang Chen
Yu Cheng
MoE
87
9
0
17 Jun 2024
Mixture-of-Skills: Learning to Optimize Data Usage for Fine-Tuning Large Language Models
Minghao Wu
Thuy-Trang Vu
Zhuang Li
Gholamreza Haffari
66
6
0
13 Jun 2024
UltraMedical: Building Specialized Generalists in Biomedicine
Kaiyan Zhang
Sihang Zeng
Ermo Hua
Ning Ding
Zhang-Ren Chen
...
Xuekai Zhu
Xingtai Lv
Hu Jinfang
Zhiyuan Liu
Bowen Zhou
LM&MA
88
33
0
06 Jun 2024
Aya 23: Open Weight Releases to Further Multilingual Progress
Viraat Aryabumi
John Dang
Dwarak Talupuru
Saurabh Dash
David Cairuz
...
Aidan Gomez
Phil Blunsom
Marzieh Fadaee
Ahmet Üstün
Sara Hooker
OSLM
103
86
0
23 May 2024
WildChat: 1M ChatGPT Interaction Logs in the Wild
Wenting Zhao
Xiang Ren
Jack Hessel
Claire Cardie
Yejin Choi
Yuntian Deng
86
230
0
02 May 2024
Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning
Shivalika Singh
Freddie Vargus
Daniel D'souza
Börje F. Karlsson
Abinaya Mahendiran
...
Max Bartolo
Julia Kreutzer
Ahmet Üstün
Marzieh Fadaee
Sara Hooker
205
126
0
09 Feb 2024
Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning
Ming Li
Yong Zhang
Shwai He
Zhitao Li
Hongyu Zhao
Jianzong Wang
Ning Cheng
Dinesh Manocha
96
80
0
01 Feb 2024
Multilingual Instruction Tuning With Just a Pinch of Multilinguality
Uri Shaham
Jonathan Herzig
Roee Aharoni
Idan Szpektor
Reut Tsarfaty
Matan Eyal
LRM
93
52
0
03 Jan 2024
What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning
Wei Liu
Weihao Zeng
Keqing He
Yong Jiang
Junxian He
ALM
101
239
0
25 Dec 2023
Turning English-centric LLMs Into Polyglots: How Much Multilinguality Is Needed?
Tannon Kew
Florian Schottmann
Rico Sennrich
LRM
77
40
0
20 Dec 2023
Efficient Online Data Mixing For Language Model Pre-Training
Alon Albalak
Liangming Pan
Colin Raffel
Wenjie Wang
78
45
0
05 Dec 2023
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
L. Yu
Weisen Jiang
Han Shi
Jincheng Yu
Zhengying Liu
Yu Zhang
James T. Kwok
Zheng Li
Adrian Weller
Weiyang Liu
OSLM
LRM
106
395
0
21 Sep 2023
LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset
Lianmin Zheng
Wei-Lin Chiang
Ying Sheng
Tianle Li
Siyuan Zhuang
...
Zi Lin
Eric P. Xing
Joseph E. Gonzalez
Ion Stoica
Haotong Zhang
97
221
0
21 Sep 2023
Monolingual or Multilingual Instruction Tuning: Which Makes a Better Alpaca
Pinzhen Chen
Shaoxiong Ji
Nikolay Bogoychev
Andrey Kutuzov
Barry Haddow
Kenneth Heafield
83
47
0
16 Sep 2023
From Quantity to Quality: Boosting LLM Performance with Self-Guided Data Selection for Instruction Tuning
Ming Li
Yong Zhang
Zhitao Li
Jiuhai Chen
Lichang Chen
Ning Cheng
Jianzong Wang
Dinesh Manocha
Jing Xiao
110
212
0
23 Aug 2023
A Preliminary Study of the Intrinsic Relationship between Complexity and Alignment
Ying Zhao
Yu Bowen
Binyuan Hui
Haiyang Yu
Fei Huang
Yongbin Li
N. Zhang
103
24
0
10 Aug 2023
Language Versatilists vs. Specialists: An Empirical Revisiting on Multilingual Transfer Ability
Jiacheng Ye
Xijia Tao
Lingpeng Kong
LRM
64
27
0
11 Jun 2023
Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond
Jingfeng Yang
Hongye Jin
Ruixiang Tang
Xiaotian Han
Qizhang Feng
Haoming Jiang
Bing Yin
Helen Zhou
LM&MA
206
676
0
26 Apr 2023
OPT-IML: Scaling Language Model Instruction Meta Learning through the Lens of Generalization
Srinivasan Iyer
Xi Lin
Ramakanth Pasunuru
Todor Mihaylov
Daniel Simig
...
Jeff Wang
Christopher Dewan
Asli Celikyilmaz
Luke Zettlemoyer
Veselin Stoyanov
ALM
146
267
0
22 Dec 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
BigScience Workshop
Teven Le Scao
Angela Fan
Christopher Akiki
...
Zhongli Xie
Zifan Ye
M. Bras
Younes Belkada
Thomas Wolf
VLM
408
2,393
0
09 Nov 2022
Scaling Instruction-Finetuned Language Models
Hyung Won Chung
Le Hou
Shayne Longpre
Barret Zoph
Yi Tay
...
Jacob Devlin
Adam Roberts
Denny Zhou
Quoc V. Le
Jason W. Wei
ReLM
LRM
231
3,158
0
20 Oct 2022
On the Limitations of Dataset Balancing: The Lost Battle Against Spurious Correlations
Roy Schwartz
Gabriel Stanovsky
94
26
0
27 Apr 2022
MedMCQA : A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering
Ankit Pal
Logesh Kumar Umapathi
Malaikannan Sankarasubbu
ELM
LM&MA
87
348
0
27 Mar 2022
Training Verifiers to Solve Math Word Problems
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
...
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
ReLM
OffRL
LRM
350
4,596
0
27 Oct 2021
Multitask Prompted Training Enables Zero-Shot Task Generalization
Victor Sanh
Albert Webson
Colin Raffel
Stephen H. Bach
Lintang Sutawika
...
T. Bers
Stella Biderman
Leo Gao
Thomas Wolf
Alexander M. Rush
LRM
355
1,709
0
15 Oct 2021
Uncertainty-Aware Balancing for Multilingual and Multi-Domain Neural Machine Translation Training
Minghao Wu
Yitong Li
Meng Zhang
Liangyou Li
Gholamreza Haffari
Qun Liu
72
22
0
06 Sep 2021
Finetuned Language Models Are Zero-Shot Learners
Jason W. Wei
Maarten Bosma
Vincent Zhao
Kelvin Guu
Adams Wei Yu
Brian Lester
Nan Du
Andrew M. Dai
Quoc V. Le
ALM
UQCV
246
3,789
0
03 Sep 2021
Measuring Mathematical Problem Solving With the MATH Dataset
Dan Hendrycks
Collin Burns
Saurav Kadavath
Akul Arora
Steven Basart
Eric Tang
Basel Alomair
Jacob Steinhardt
ReLM
FaML
183
2,405
0
05 Mar 2021
Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models
Zirui Wang
Yulia Tsvetkov
Orhan Firat
Yuan Cao
74
202
0
12 Oct 2020
Gamma distribution-based sampling for imbalanced data
Firuz Kamalov
Dmitry Denisov
76
43
0
22 Sep 2020
Multi-Task Learning with Deep Neural Networks: A Survey
M. Crawshaw
CVBM
217
625
0
10 Sep 2020
Measuring Massive Multitask Language Understanding
Dan Hendrycks
Collin Burns
Steven Basart
Andy Zou
Mantas Mazeika
Basel Alomair
Jacob Steinhardt
ELM
RALM
187
4,572
0
07 Sep 2020
XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning
Edoardo Ponti
Goran Glavaš
Olga Majewska
Qianchu Liu
Ivan Vulić
Anna Korhonen
LRM
86
328
0
01 May 2020
Balancing Training for Multilingual Neural Machine Translation
Xinyi Wang
Yulia Tsvetkov
Graham Neubig
106
101
0
14 Apr 2020
Gradient Surgery for Multi-Task Learning
Tianhe Yu
Saurabh Kumar
Abhishek Gupta
Sergey Levine
Karol Hausman
Chelsea Finn
189
1,230
0
19 Jan 2020
Optimizing Data Usage via Differentiable Rewards
Xinyi Wang
Hieu H. Pham
Paul Michel
Antonios Anastasopoulos
J. Carbonell
Graham Neubig
78
62
0
22 Nov 2019
Unsupervised Cross-lingual Representation Learning at Scale
Alexis Conneau
Kartikay Khandelwal
Naman Goyal
Vishrav Chaudhary
Guillaume Wenzek
Francisco Guzmán
Edouard Grave
Myle Ott
Luke Zettlemoyer
Veselin Stoyanov
228
6,593
0
05 Nov 2019
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel
Noam M. Shazeer
Adam Roberts
Katherine Lee
Sharan Narang
Michael Matena
Yanqi Zhou
Wei Li
Peter J. Liu
AIMat
488
20,342
0
23 Oct 2019
Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges
N. Arivazhagan
Ankur Bapna
Orhan Firat
Dmitry Lepikhin
Melvin Johnson
...
George F. Foster
Colin Cherry
Wolfgang Macherey
Zhiwen Chen
Yonghui Wu
96
428
0
11 Jul 2019
Dynamic Curriculum Learning for Imbalanced Data Classification
Yiru Wang
Weihao Gan
Jie Yang
Wei Wu
Junjie Yan
81
222
0
21 Jan 2019
XNLI: Evaluating Cross-lingual Sentence Representations
Alexis Conneau
Guillaume Lample
Ruty Rinott
Adina Williams
Samuel R. Bowman
Holger Schwenk
Veselin Stoyanov
ELM
90
1,388
0
13 Sep 2018
GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks
Zhao Chen
Vijay Badrinarayanan
Chen-Yu Lee
Andrew Rabinovich
ODL
173
1,293
0
07 Nov 2017
An Overview of Multi-Task Learning in Deep Neural Networks
Sebastian Ruder
CVBM
159
2,831
0
15 Jun 2017
Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics
Alex Kendall
Y. Gal
R. Cipolla
3DH
272
3,136
0
19 May 2017