Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2010.05874
Cited By
Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models
12 October 2020
Zirui Wang
Yulia Tsvetkov
Orhan Firat
Yuan Cao
Re-assign community
ArXiv
PDF
HTML
Papers citing
"Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models"
49 / 49 papers shown
Title
BoundarySeg:An Embarrassingly Simple Method To Boost Medical Image Segmentation Performance for Low Data Regimes
Tushar Kataria
Shireen Y. Elhabian
29
0
0
14 May 2025
CONGRAD:Conflicting Gradient Filtering for Multilingual Preference Alignment
Jiangnan Li
Thuy-Trang Vu
Christian Herold
Amirhossein Tebbifakhr
Shahram Khadivi
Gholamreza Haffari
33
0
0
31 Mar 2025
Transforming Vision Transformer: Towards Efficient Multi-Task Asynchronous Learning
Hanwen Zhong
Jiaxin Chen
Yutong Zhang
Di Huang
Yunhong Wang
MoE
42
0
0
12 Jan 2025
Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate
Zhiqi Bu
Xiaomeng Jin
Bhanukiran Vinzamuri
Anil Ramakrishna
Kai-Wei Chang
V. Cevher
Mingyi Hong
MU
88
6
0
29 Oct 2024
Swiss Army Knife: Synergizing Biases in Knowledge from Vision Foundation Models for Multi-Task Learning
Yuxiang Lu
Shengcao Cao
Yu-xiong Wang
55
1
0
18 Oct 2024
Upsample or Upweight? Balanced Training on Heavily Imbalanced Datasets
Tianjian Li
Haoran Xu
Weiting Tan
Kenton Murray
Daniel Khashabi
35
1
0
06 Oct 2024
X-ALMA: Plug & Play Modules and Adaptive Rejection for Quality Translation at Scale
Haoran Xu
Kenton W. Murray
Philipp Koehn
Hieu T. Hoang
Akiko Eriguchi
Huda Khayrallah
34
8
0
04 Oct 2024
Task-Adaptive Pretrained Language Models via Clustered-Importance Sampling
David Grangier
Simin Fan
Skyler Seto
Pierre Ablin
44
3
0
30 Sep 2024
Can Optimization Trajectories Explain Multi-Task Transfer?
David Mueller
Mark Dredze
Nicholas Andrews
61
1
0
26 Aug 2024
Customizing Language Models with Instance-wise LoRA for Sequential Recommendation
Xiaoyu Kong
Jiancan Wu
An Zhang
Leheng Sheng
Hui Lin
Xiang Wang
Xiangnan He
AI4TS
58
7
0
19 Aug 2024
Pareto Low-Rank Adapters: Efficient Multi-Task Learning with Preferences
Nikolaos Dimitriadis
Pascal Frossard
F. Fleuret
MoE
67
6
0
10 Jul 2024
Towards Modular LLMs by Building and Reusing a Library of LoRAs
O. Ostapenko
Zhan Su
E. Ponti
Laurent Charlin
Nicolas Le Roux
Matheus Pereira
Lucas Caccia
Alessandro Sordoni
MoMe
44
31
0
18 May 2024
Robust Analysis of Multi-Task Learning Efficiency: New Benchmarks on Light-Weighed Backbones and Effective Measurement of Multi-Task Learning Challenges by Feature Disentanglement
Dayou Mao
Yuhao Chen
Yifan Wu
Maximilian Gilles
Alexander Wong
AAML
41
0
0
05 Feb 2024
Careful with that Scalpel: Improving Gradient Surgery with an EMA
Yu-Guan Hsieh
James Thornton
Eugène Ndiaye
Michal Klein
Marco Cuturi
Pierre Ablin
MedIm
39
0
0
05 Feb 2024
A First-Order Multi-Gradient Algorithm for Multi-Objective Bi-Level Optimization
Feiyang Ye
Baijiong Lin
Xiao-Qun Cao
Yu Zhang
Ivor Tsang
50
6
0
17 Jan 2024
GradSim: Gradient-Based Language Grouping for Effective Multilingual Training
Mingyang Wang
Heike Adel
Lukas Lange
Jannik Strötgen
Hinrich Schütze
30
3
0
23 Oct 2023
Adaptive Neural Ranking Framework: Toward Maximized Business Goal for Cascade Ranking Systems
Yunli Wang
Zhiqiang Wang
Jian Yang
Shiyang Wen
Dongying Kong
Han Li
Kun Gai
32
10
0
16 Oct 2023
Transformer-based Multimodal Change Detection with Multitask Consistency Constraints
Biyuan Liu
Huaixin Chen
Kun Li
Michael Ying Yang
38
14
0
13 Oct 2023
Deep Task-specific Bottom Representation Network for Multi-Task Recommendation
Qi Liu
Zhilong Zhou
Gangwei Jiang
T. Ge
Defu Lian
20
12
0
11 Aug 2023
TaskExpert: Dynamically Assembling Multi-Task Representations with Memorial Mixture-of-Experts
Hanrong Ye
Dan Xu
MoE
42
26
0
28 Jul 2023
Fairness in Multi-Task Learning via Wasserstein Barycenters
Franccois Hu
Philipp Ratz
Arthur Charpentier
37
10
0
16 Jun 2023
Addressing Negative Transfer in Diffusion Models
Hyojun Go
Jinyoung Kim
Yunsung Lee
Seunghyun Lee
Shinhyeok Oh
Hyeongdon Moon
Seungtaek Choi
DiffM
VLM
32
24
0
01 Jun 2023
Exploring Representational Disparities Between Multilingual and Bilingual Translation Models
Neha Verma
Kenton W. Murray
Kevin Duh
14
0
0
23 May 2023
FedAds: A Benchmark for Privacy-Preserving CVR Estimation with Vertical Federated Learning
Penghui Wei
Hongjian Dou
Shaoguo Liu
Rong Tang
Li Liu
Liangji Wang
Bo Zheng
FedML
24
12
0
15 May 2023
KINLP at SemEval-2023 Task 12: Kinyarwanda Tweet Sentiment Analysis
Antoine Nzeyimana
20
3
0
25 Apr 2023
UniMax: Fairer and more Effective Language Sampling for Large-Scale Multilingual Pretraining
Hyung Won Chung
Noah Constant
Xavier Garcia
Adam Roberts
Yi Tay
Sharan Narang
Orhan Firat
26
50
0
18 Apr 2023
On the Pareto Front of Multilingual Neural Machine Translation
Liang Chen
Shuming Ma
Dongdong Zhang
Furu Wei
Baobao Chang
MoE
23
5
0
06 Apr 2023
Modular Deep Learning
Jonas Pfeiffer
Sebastian Ruder
Ivan Vulić
E. Ponti
MoMe
OOD
32
73
0
22 Feb 2023
Scaling Laws for Multilingual Neural Machine Translation
Patrick Fernandes
Behrooz Ghorbani
Xavier Garcia
Markus Freitag
Orhan Firat
38
29
0
19 Feb 2023
GAT: Guided Adversarial Training with Pareto-optimal Auxiliary Tasks
Salah Ghamizi
Jingfeng Zhang
Maxime Cordy
Mike Papadakis
Masashi Sugiyama
Yves Le Traon
AAML
28
2
0
06 Feb 2023
FairRoad: Achieving Fairness for Recommender Systems with Optimized Antidote Data
Minghong Fang
Jia-Wei Liu
Michinari Momma
Yi Sun
30
4
0
13 Dec 2022
M
3
^3
3
ViT: Mixture-of-Experts Vision Transformer for Efficient Multi-task Learning with Model-Accelerator Co-design
Hanxue Liang
Zhiwen Fan
Rishov Sarkar
Ziyu Jiang
Tianlong Chen
Kai Zou
Yu Cheng
Cong Hao
Zhangyang Wang
MoE
42
81
0
26 Oct 2022
PaCo: Parameter-Compositional Multi-Task Reinforcement Learning
Lingfeng Sun
Haichao Zhang
Wei-ping Xu
Masayoshi Tomizuka
MoE
30
36
0
21 Oct 2022
Personalizing Intervened Network for Long-tailed Sequential User Behavior Modeling
Zheqi Lv
Feng Wang
Shengyu Zhang
Kun Kuang
Hongxia Yang
Fei Wu
37
8
0
19 Aug 2022
LibMTL: A Python Library for Multi-Task Learning
Baijiong Lin
Yu Zhang
OffRL
AI4CE
25
37
0
27 Mar 2022
X-Learner: Learning Cross Sources and Tasks for Universal Visual Representation
Yinan He
Gengshi Huang
Siyu Chen
Jianing Teng
Wang Kun
Zhen-fei Yin
Lu Sheng
Ziwei Liu
Yu Qiao
Jing Shao
VLM
SSL
ViT
43
7
0
16 Mar 2022
Combining Modular Skills in Multitask Learning
E. Ponti
Alessandro Sordoni
Yoshua Bengio
Siva Reddy
MoE
12
37
0
28 Feb 2022
Structured Multi-task Learning for Molecular Property Prediction
Shengchao Liu
Meng Qu
Zuobai Zhang
Huiyu Cai
Jian Tang
17
24
0
22 Feb 2022
mSLAM: Massively multilingual joint pre-training for speech and text
Ankur Bapna
Colin Cherry
Yu Zhang
Ye Jia
Melvin Johnson
Yong Cheng
Simran Khanuja
Jason Riesa
Alexis Conneau
VLM
24
111
0
03 Feb 2022
In Defense of the Unitary Scalarization for Deep Multi-Task Learning
Vitaly Kurin
Alessandro De Palma
Ilya Kostrikov
Shimon Whiteson
M. P. Kumar
39
73
0
11 Jan 2022
Speech Representation Learning Through Self-supervised Pretraining And Multi-task Finetuning
Yi-Chen Chen
Shu-Wen Yang
Cheng-Kuang Lee
Simon See
Hung-yi Lee
SSL
19
12
0
18 Oct 2021
Sequential Reptile: Inter-Task Gradient Alignment for Multilingual Learning
Seanie Lee
Haebeom Lee
Juho Lee
Sung Ju Hwang
MoMe
CLL
45
16
0
06 Oct 2021
A Conditional Generative Matching Model for Multi-lingual Reply Suggestion
Budhaditya Deb
Guoqing Zheng
Milad Shokouhi
Ahmed Hassan Awadallah
31
1
0
15 Sep 2021
Domain Generalization via Gradient Surgery
Lucas Mansilla
Rodrigo Echeveste
Diego H. Milone
Enzo Ferrante
OOD
24
78
0
03 Aug 2021
Scaling End-to-End Models for Large-Scale Multilingual ASR
Bo-wen Li
Ruoming Pang
Tara N. Sainath
Anmol Gulati
Yu Zhang
James Qin
Parisa Haghani
Yifan Jiang
Min Ma
Junwen Bai
CLL
34
76
0
30 Apr 2021
RotoGrad: Gradient Homogenization in Multitask Learning
Adrián Javaloy
Isabel Valera
21
86
0
03 Mar 2021
Measuring and Harnessing Transference in Multi-Task Learning
Christopher Fifty
Ehsan Amid
Zhe Zhao
Tianhe Yu
Rohan Anil
Chelsea Finn
28
15
0
29 Oct 2020
Investigating Multilingual NMT Representations at Scale
Sneha Kudugunta
Ankur Bapna
Isaac Caswell
N. Arivazhagan
Orhan Firat
LRM
144
120
0
05 Sep 2019
Multi-Way, Multilingual Neural Machine Translation with a Shared Attention Mechanism
Orhan Firat
Kyunghyun Cho
Yoshua Bengio
LRM
AIMat
231
623
0
06 Jan 2016
1