Provably Robust DPO: Aligning Language Models with Noisy Feedback
Sayak Ray Chowdhury, Anush Kini, Nagarajan Natarajan
1 March 2024 · arXiv:2403.00409

Papers citing "Provably Robust DPO: Aligning Language Models with Noisy Feedback"

44 papers shown

Robust Preference Optimization via Dynamic Target Margins
Jie Sun, Junkang Wu, Jiancan Wu, Zhibo Zhu, Xingyu Lu, Jun Zhou, Lintao Ma, Xiang Wang
04 Jun 2025

On Symmetric Losses for Robust Policy Optimization with Noisy Preferences
Soichiro Nishimori, Yu Zhang, Thanawat Lodkaew, Masashi Sugiyama
30 May 2025

SquareχPO: Differentially Private and Robust χ²-Preference Optimization in Offline Direct Alignment
Xingyu Zhou, Yulian Wu, Wenqian Weng, Francesco Orabona
27 May 2025

Incentivizing High-Quality Human Annotations with Golden Questions
Shang Liu, Zhongze Cai, Hanzhao Wang, Zhongyao Ma, Xiaocheng Li
25 May 2025

Optimal Transport-Based Token Weighting scheme for Enhanced Preference Optimization
Meng Li, Guangda Huzhang, Haibo Zhang, Xiting Wang, Anxiang Zeng
24 May 2025

MPO: Multilingual Safety Alignment via Reward Gap Optimization
Weixiang Zhao, Yulin Hu, Yang Deng, Tongtong Wu, Wenxuan Zhang, ..., An Zhang, Yanyan Zhao, Bing Qin, Tat-Seng Chua, Ting Liu
22 May 2025

A Unified Theoretical Analysis of Private and Robust Offline Alignment: from RLHF to DPO
Xingyu Zhou, Yulian Wu, Francesco Orabona
21 May 2025

Inducing Robustness in a 2 Dimensional Direct Preference Optimization Paradigm
Sarvesh Shashidhar, Ritik, Nachiketa Patil, Suraj Racha, Ganesh Ramakrishnan
03 May 2025

ParetoHqD: Fast Offline Multiobjective Alignment of Large Language Models using Pareto High-quality Data
Haoran Gu, Handing Wang, Yi Mei, Mengjie Zhang, Yaochu Jin
23 Apr 2025

A Comprehensive Survey of Reward Models: Taxonomy, Applications, Challenges, and Future
Jialun Zhong, Wei Shen, Yanzeng Li, Songyang Gao, Hua Lu, Yicheng Chen, Yang Zhang, Wei Zhou, Jinjie Gu, Lei Zou
12 Apr 2025

Debiasing Multimodal Large Language Models via Noise-Aware Preference Optimization
Zefeng Zhang, Hengzhu Tang, Shuaiyi Nie, Zhenyu Zhang, Yiming Ren, Zhenyang Li, Dawei Yin, Duohe Ma, Tingwen Liu
23 Mar 2025

When Preferences Diverge: Aligning Diffusion Models with Minority-Aware Adaptive DPO
Lefei Zhang, Chen Liu, C. Xu, Kai Hu, Donghao Luo, Chengjie Wang, Yanwei Fu, Yuan Yao
21 Mar 2025

Efficient Safety Alignment of Large Language Models via Preference Re-ranking and Representation-based Reward Modeling
Qiyuan Deng, X. Bai, Kehai Chen, Yaowei Wang, Liqiang Nie, Min Zhang
13 Mar 2025

RePO: ReLU-based Preference Optimization
Junkang Wu, Kexin Huang, Xue Wang, Jinyang Gao, Bolin Ding, Jiancan Wu, Xiangnan He, Xiang Wang
10 Mar 2025

PEO: Improving Bi-Factorial Preference Alignment with Post-Training Policy Extrapolation
Yuxuan Liu
03 Mar 2025

Distributionally Robust Reinforcement Learning with Human Feedback
Debmalya Mandal, Paulius Sasnauskas, Goran Radanović
01 Mar 2025

Two Heads Are Better Than One: Dual-Model Verbal Reflection at Inference-Time
Jiazheng Li, Yuxiang Zhou, Junru Lu, Gladys Tyen, Lin Gui, Cesare Aloisi, Yulan He
26 Feb 2025

OneRec: Unifying Retrieve and Rank with Generative Recommender and Iterative Preference Alignment
Jiaxin Deng, Shiyao Wang, Kuo Cai, Lejian Ren, Qigen Hu, Weifeng Ding, Qiang Luo, Guorui Zhou
26 Feb 2025

Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond
Qizhou Wang, Jin Peng Zhou, Zhanke Zhou, Saebyeol Shin, Bo Han, Kilian Q. Weinberger
26 Feb 2025

Stackelberg Game Preference Optimization for Data-Efficient Alignment of Language Models
Xu Chu, Zhixin Zhang, Tianyu Jia, Yujie Jin
25 Feb 2025

Iterative Label Refinement Matters More than Preference Optimization under Weak Supervision
Yaowen Ye, Cassidy Laidlaw, Jacob Steinhardt
14 Jan 2025

An Overview and Discussion on Using Large Language Models for Implementation Generation of Solutions to Open-Ended Problems
Hashmath Shaik, Alex Doboli
31 Dec 2024

Geometric-Averaged Preference Optimization for Soft Preference Labels
Hiroki Furuta, Kuang-Huei Lee, Shixiang Shane Gu, Y. Matsuo, Aleksandra Faust, Heiga Zen, Izzeddin Gur
31 Dec 2024

VideoSAVi: Self-Aligned Video Language Models without Human Supervision
Yogesh Kulkarni, Pooyan Fazli
01 Dec 2024

Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization
Weiyun Wang, Zhe Chen, Wenhai Wang, Yue Cao, Yangzhou Liu, ..., Jinguo Zhu, X. Zhu, Lewei Lu, Yu Qiao, Jifeng Dai
15 Nov 2024

Vision-Language Models Can Self-Improve Reasoning via Reflection
Kanzhi Cheng, Yantao Li, Fangzhi Xu, Jianbing Zhang, Hao Zhou, Yang Liu
30 Oct 2024

Optimizing Preference Alignment with Differentiable NDCG Ranking
Jiacong Zhou, Xianyun Wang, Jun Yu
17 Oct 2024

Varying Shades of Wrong: Aligning LLMs with Wrong Answers Only
Jihan Yao, Wenxuan Ding, Shangbin Feng, Lucy Lu Wang, Yulia Tsvetkov
14 Oct 2024

Simultaneous Reward Distillation and Preference Learning: Get You a Language Model Who Can Do Both
Abhijnan Nath, Changsoo Jung, Ethan Seefried, Nikhil Krishnaswamy
11 Oct 2024

PoisonBench: Assessing Large Language Model Vulnerability to Data Poisoning
Tingchen Fu, Mrinank Sharma, Philip Torr, Shay B. Cohen, David M. Krueger, Fazl Barez
11 Oct 2024

Reward Learning From Preference With Ties
Jinsong Liu, Dongdong Ge, Ruihao Zhu
05 Oct 2024

RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization
Hanyang Zhao, Genta Indra Winata, Anirban Das, Shi-Xiong Zhang, D. Yao, Wenpin Tang, Sambit Sahu
05 Oct 2024

GameLabel-10K: Collecting Image Preference Data Through Mobile Game Crowdsourcing
Jonathan Zhou
30 Sep 2024

Evaluation of Large Language Models for Summarization Tasks in the Medical Domain: A Narrative Review
Emma Croxford, Yanjun Gao, Nicholas Pellegrino, Karen K. Wong, Graham Wills, Elliot First, Frank J. Liao, Cherodeep Goswami, Brian Patterson, Majid Afshar
26 Sep 2024

Alignment with Preference Optimization Is All You Need for LLM Safety
Réda Alami, Ali Khalifa Almansoori, Ahmed Alzubaidi, M. Seddik, Mugariya Farooq, Hakim Hacid
12 Sep 2024

Alignment of Diffusion Models: Fundamentals, Challenges, and Future
Buhua Liu, Shitong Shao, Bao Li, Lichen Bai, Zhiqiang Xu, Haoyi Xiong, James Kwok, Sumi Helal, Zeke Xie
11 Sep 2024

On the Generalization of Preference Learning with DPO
Shawn Im, Yixuan Li
06 Aug 2024

Right Now, Wrong Then: Non-Stationary Direct Preference Optimization under Preference Drift
Seongho Son, William Bankes, Sayak Ray Chowdhury, Brooks Paige, Ilija Bogunovic
26 Jul 2024

PORT: Preference Optimization on Reasoning Traces
Salem Lahlou, Abdalgader Abubaker, Hakim Hacid
23 Jun 2024

Preferred-Action-Optimized Diffusion Policies for Offline Reinforcement Learning
Tianle Zhang, Jiayi Guan, Lin Zhao, Yihang Li, Dongjiang Li, ..., Lei Sun, Yue Chen, Xuelong Wei, Lusong Li, Xiaodong He
29 May 2024

Soft Preference Optimization: Aligning Language Models to Expert Distributions
Arsalan Sharifnassab, Sina Ghiassian, Saber Salehkaleybar, Surya Kanoria, Dale Schuurmans
30 Apr 2024

ROPO: Robust Preference Optimization for Large Language Models
Xize Liang, Chao Chen, Shuang Qiu, Jie Wang, Yue-bo Wu, Zhihang Fu, Zhihao Shi, Feng Wu, Jieping Ye
05 Apr 2024

CURATRON: Complete Robust Preference Data for Robust Alignment of Large Language Models
S. Nguyen, Uma-Naresh Niranjan, Theja Tulabandhula
05 Mar 2024

Active Preference Optimization for Sample Efficient RLHF
Nirjhar Das, Souradip Chakraborty, Aldo Pacchiano, Sayak Ray Chowdhury
16 Feb 2024