
Distributional Preference Learning: Understanding and Accounting for Hidden Context in RLHF

13 December 2023

Papers citing "Distributional Preference Learning: Understanding and Accounting for Hidden Context in RLHF"

2 / 52 papers shown

Title, authors, date

Crowd-PrefRL: Preference-Based Reward Learning from Crowds
David Chhan, Ellen R. Novoseller, Vernon J. Lawhern
17 Jan 2024

A Minimaximalist Approach to Reinforcement Learning from Human Feedback
Gokul Swamy, Christoph Dann, Rahul Kidambi, Zhiwei Steven Wu, Alekh Agarwal [OffRL]
08 Jan 2024