2410.08067
Cited By
Reward-Augmented Data Enhances Direct Preference Alignment of LLMs
10 October 2024
Shenao Zhang
Zhihan Liu
Boyi Liu
Yuhang Zhang
Yingxiang Yang
Y. Liu
Liyu Chen
Tao Sun
Ziyi Wang
Papers citing "Reward-Augmented Data Enhances Direct Preference Alignment of LLMs"
2 / 2 papers shown
Title
DSTC: Direct Preference Learning with Only Self-Generated Tests and Code to Improve Code LMs
Zhihan Liu
Shenao Zhang
Yongfei Liu
Boyi Liu
Yingxiang Yang
Zhaoran Wang
113
2
0
20 Nov 2024
Online Bandit Learning with Offline Preference Data for Improved RLHF
Akhil Agnihotri
Rahul Jain
Deepak Ramachandran
Zheng Wen
OffRL
37
2
0
13 Jun 2024