2407.04451
Cited By
Hindsight Preference Learning for Offline Preference-based Reinforcement Learning
5 July 2024
Chen-Xiao Gao, Shengjun Fang, Chenjun Xiao, Yang Yu, Zongzhang Zhang
OffRL
Papers citing "Hindsight Preference Learning for Offline Preference-based Reinforcement Learning"
5 / 5 papers shown
Title
Hindsight PRIORs for Reward Learning from Human Preferences
Mudit Verma, Katherine Metcalf
48 | 5 | 0
12 Apr 2024

Training language models to follow instructions with human feedback
Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe
OSLM, ALM
333 | 12,003 | 0
04 Mar 2022

Offline Reinforcement Learning with Implicit Q-Learning
Ilya Kostrikov, Ashvin Nair, Sergey Levine
OffRL
214 | 843 | 0
12 Oct 2021

Reward (Mis)design for Autonomous Driving
W. B. Knox, A. Allievi, Holger Banzhaf, Felix Schmitt, Peter Stone
83 | 113 | 0
28 Apr 2021

Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine, Aviral Kumar, George Tucker, Justin Fu
OffRL, GP
340 | 1,960 | 0
04 May 2020