Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2503.23777
Cited By
CONGRAD:Conflicting Gradient Filtering for Multilingual Preference Alignment
31 March 2025
Jiangnan Li
Thuy-Trang Vu
Christian Herold
Amirhossein Tebbifakhr
Shahram Khadivi
Gholamreza Haffari
Re-assign community
ArXiv
PDF
HTML
Papers citing
"CONGRAD:Conflicting Gradient Filtering for Multilingual Preference Alignment"
43 / 43 papers shown
Title
Implicit Cross-Lingual Rewarding for Efficient Multilingual Preference Alignment
Wen Yang
Junhong Wu
Chen Wang
Chengqing Zong
J.N. Zhang
111
1
0
06 Mar 2025
GPT-4o System Card
OpenAI OpenAI
:
Aaron Hurst
Adam Lerer
Adam P. Goucher
...
Yuchen He
Yuchen Zhang
Yujia Jin
Yunxing Dai
Yury Malkov
MLLM
160
893
0
25 Oct 2024
M-RewardBench: Evaluating Reward Models in Multilingual Settings
Srishti Gureja
Lester James V. Miranda
Shayekh Bin Islam
Rishabh Maheshwary
Drishti Sharma
Gusti Winata
Nathan Lambert
Sebastian Ruder
Sara Hooker
Marzieh Fadaee
LRM
84
19
0
20 Oct 2024
Language Imbalance Driven Rewarding for Multilingual Self-improving
Wen Yang
Junhong Wu
Chen Wang
Chengqing Zong
J.N. Zhang
ALM
LRM
150
7
0
11 Oct 2024
Gemma 2: Improving Open Language Models at a Practical Size
Gemma Team
Gemma Team Morgane Riviere
Shreya Pathak
Pier Giuseppe Sessa
Cassidy Hardin
...
Noah Fiedel
Armand Joulin
Kathleen Kenealy
Robert Dadashi
Alek Andreev
VLM
MoE
OSLM
100
841
0
31 Jul 2024
RLHF Can Speak Many Languages: Unlocking Multilingual Preference Optimization for LLMs
John Dang
Arash Ahmadian
Kelly Marchisio
Julia Kreutzer
Ahmet Üstün
Sara Hooker
78
27
0
02 Jul 2024
DART-Math: Difficulty-Aware Rejection Tuning for Mathematical Problem-Solving
Yuxuan Tong
Xiwen Zhang
Rui Wang
R. Wu
Junxian He
AIMat
LRM
62
39
0
18 Jun 2024
Iterative Length-Regularized Direct Preference Optimization: A Case Study on Improving 7B Language Models to GPT-4 Level
Jie Liu
Zhanhui Zhou
Jiaheng Liu
Xingyuan Bu
Chao Yang
Han-Sen Zhong
Wanli Ouyang
45
19
0
17 Jun 2024
Mixture-of-Skills: Learning to Optimize Data Usage for Fine-Tuning Large Language Models
Minghao Wu
Thuy-Trang Vu
Zhuang Li
Gholamreza Haffari
54
6
0
13 Jun 2024
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework
Jian Hu
Xibin Wu
Weixun Wang
OpenLLMAI Team
Dehao Zhang
Yu Cao
AI4CE
VLM
69
118
0
20 May 2024
Iterative Reasoning Preference Optimization
Richard Yuanzhe Pang
Weizhe Yuan
Kyunghyun Cho
He He
Sainbayar Sukhbaatar
Jason Weston
LRM
90
126
0
30 Apr 2024
Filtered Direct Preference Optimization
Tetsuro Morimura
Mitsuki Sakamoto
Yuu Jinnai
Kenshi Abe
Kaito Air
75
13
0
22 Apr 2024
Enhancing Multilingual Capabilities of Large Language Models through Self-Distillation from Resource-Rich Languages
Yuan Zhang
Yile Wang
Zijun Liu
Shuo Wang
Xiaolong Wang
Peng Li
Maosong Sun
Yang Liu
LRM
80
14
0
19 Feb 2024
Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning
Shivalika Singh
Freddie Vargus
Daniel D'souza
Börje F. Karlsson
Abinaya Mahendiran
...
Max Bartolo
Julia Kreutzer
Ahmet Üstün
Marzieh Fadaee
Sara Hooker
170
122
0
09 Feb 2024
LESS: Selecting Influential Data for Targeted Instruction Tuning
Mengzhou Xia
Sadhika Malladi
Suchin Gururangan
Sanjeev Arora
Danqi Chen
123
231
0
06 Feb 2024
Flora: Low-Rank Adapters Are Secretly Gradient Compressors
Yongchang Hao
Yanshuai Cao
Lili Mou
51
50
0
05 Feb 2024
Self-Rewarding Language Models
Weizhe Yuan
Richard Yuanzhe Pang
Kyunghyun Cho
Xian Li
Sainbayar Sukhbaatar
Jing Xu
Jason Weston
ReLM
SyDa
ALM
LRM
304
321
0
18 Jan 2024
Multilingual Instruction Tuning With Just a Pinch of Multilinguality
Uri Shaham
Jonathan Herzig
Roee Aharoni
Idan Szpektor
Reut Tsarfaty
Matan Eyal
LRM
55
49
0
03 Jan 2024
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
Zixiang Chen
Yihe Deng
Huizhuo Yuan
Kaixuan Ji
Quanquan Gu
SyDa
78
311
0
02 Jan 2024
Improving In-context Learning of Multilingual Generative Language Models with Cross-lingual Alignment
Chong Li
Shaonan Wang
Jiajun Zhang
Chengqing Zong
50
20
0
14 Nov 2023
Multilingual Jailbreak Challenges in Large Language Models
Yue Deng
Wenxuan Zhang
Sinno Jialin Pan
Lidong Bing
AAML
85
131
0
10 Oct 2023
Understanding the Effects of RLHF on LLM Generalisation and Diversity
Robert Kirk
Ishita Mediratta
Christoforos Nalmpantis
Jelena Luketina
Eric Hambro
Edward Grefenstette
Roberta Raileanu
AI4CE
ALM
150
145
0
10 Oct 2023
Monolingual or Multilingual Instruction Tuning: Which Makes a Better Alpaca
Pinzhen Chen
Shaoxiong Ji
Nikolay Bogoychev
Andrey Kutuzov
Barry Haddow
Kenneth Heafield
66
47
0
16 Sep 2023
Mitigating the Alignment Tax of RLHF
Yong Lin
Hangyu Lin
Wei Xiong
Shizhe Diao
Zeming Zheng
...
Han Zhao
Nan Jiang
Heng Ji
Yuan Yao
Tong Zhang
MoMe
CLL
58
75
0
12 Sep 2023
Efficient Memory Management for Large Language Model Serving with PagedAttention
Woosuk Kwon
Zhuohan Li
Siyuan Zhuang
Ying Sheng
Lianmin Zheng
Cody Hao Yu
Joseph E. Gonzalez
Haotong Zhang
Ion Stoica
VLM
154
2,163
0
12 Sep 2023
Gender bias and stereotypes in Large Language Models
Hadas Kotek
Rikker Dockum
David Q. Sun
101
226
0
28 Aug 2023
Okapi: Instruction-tuned Large Language Models in Multiple Languages with Reinforcement Learning from Human Feedback
Viet Dac Lai
Chien Van Nguyen
Nghia Trung Ngo
Thuat Nguyen
Franck Dernoncourt
Ryan Rossi
Thien Huu Nguyen
ALM
74
145
0
29 Jul 2023
Llama 2: Open Foundation and Fine-Tuned Chat Models
Hugo Touvron
Louis Martin
Kevin R. Stone
Peter Albert
Amjad Almahairi
...
Sharan Narang
Aurelien Rodriguez
Robert Stojnic
Sergey Edunov
Thomas Scialom
AI4MH
ALM
267
11,791
0
18 Jul 2023
On Evaluating and Mitigating Gender Biases in Multilingual Settings
Aniket Vashishtha
Kabir Ahuja
Sunayana Sitaram
48
24
0
04 Jul 2023
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
ALM
313
3,895
0
29 May 2023
GPTAraEval: A Comprehensive Evaluation of ChatGPT on Arabic NLP
Md. Tawkat Islam Khondaker
Abdul Waheed
El Moatez Billah Nagoudi
Muhammad Abdul-Mageed
ELM
LM&MA
50
69
0
24 May 2023
AlpacaFarm: A Simulation Framework for Methods that Learn from Human Feedback
Yann Dubois
Xuechen Li
Rohan Taori
Tianyi Zhang
Ishaan Gulrajani
Jimmy Ba
Carlos Guestrin
Percy Liang
Tatsunori B. Hashimoto
ALM
108
593
0
22 May 2023
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback
Yuntao Bai
Andy Jones
Kamal Ndousse
Amanda Askell
Anna Chen
...
Jack Clark
Sam McCandlish
C. Olah
Benjamin Mann
Jared Kaplan
239
2,535
0
12 Apr 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
769
12,835
0
04 Mar 2022
Uncertainty-Aware Balancing for Multilingual and Multi-Domain Neural Machine Translation Training
Minghao Wu
Yitong Li
Meng Zhang
Liangyou Li
Gholamreza Haffari
Qun Liu
59
22
0
06 Sep 2021
Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models
Zirui Wang
Yulia Tsvetkov
Orhan Firat
Yuan Cao
65
202
0
12 Oct 2020
Gradient Surgery for Multi-Task Learning
Tianhe Yu
Saurabh Kumar
Abhishek Gupta
Sergey Levine
Karol Hausman
Chelsea Finn
157
1,211
0
19 Jan 2020
Unsupervised Cross-lingual Representation Learning at Scale
Alexis Conneau
Kartikay Khandelwal
Naman Goyal
Vishrav Chaudhary
Guillaume Wenzek
Francisco Guzmán
Edouard Grave
Myle Ott
Luke Zettlemoyer
Veselin Stoyanov
195
6,538
0
05 Nov 2019
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
452
1,717
0
18 Sep 2019
Massively Multilingual Neural Machine Translation in the Wild: Findings and Challenges
N. Arivazhagan
Ankur Bapna
Orhan Firat
Dmitry Lepikhin
Melvin Johnson
...
George F. Foster
Colin Cherry
Wolfgang Macherey
Zhiwen Chen
Yonghui Wu
69
427
0
11 Jul 2019
Think you have Solved Question Answering? Try ARC, the AI2 Reasoning Challenge
Peter Clark
Isaac Cowhey
Oren Etzioni
Tushar Khot
Ashish Sabharwal
Carissa Schoenick
Oyvind Tafjord
ELM
RALM
LRM
146
2,567
0
14 Mar 2018
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
446
18,931
0
20 Jul 2017
A Multifaceted Evaluation of Neural versus Phrase-Based Machine Translation for 9 Language Directions
Antonio Toral
Víctor M. Sánchez-Cartagena
61
147
0
11 Jan 2017
1