ResearchTrend.AI
  • Papers
  • Communities
  • Events
  • Blog
  • Pricing
Papers
Communities
Social Events
Terms and Conditions
Pricing
Parameter LabParameter LabTwitterGitHubLinkedInBlueskyYoutube

© 2025 ResearchTrend.AI, All rights reserved.

  1. Home
  2. Papers
  3. 2407.04451
  4. Cited By
Hindsight Preference Learning for Offline Preference-based Reinforcement
  Learning

Hindsight Preference Learning for Offline Preference-based Reinforcement Learning

5 July 2024
Chen-Xiao Gao
Shengjun Fang
Chenjun Xiao
Yang Yu
Zongzhang Zhang
    OffRL
ArXivPDFHTML

Papers citing "Hindsight Preference Learning for Offline Preference-based Reinforcement Learning"

5 / 5 papers shown
Title
Hindsight PRIORs for Reward Learning from Human Preferences
Hindsight PRIORs for Reward Learning from Human Preferences
Mudit Verma
Katherine Metcalf
48
5
0
12 Apr 2024
Training language models to follow instructions with human feedback
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
333
12,003
0
04 Mar 2022
Offline Reinforcement Learning with Implicit Q-Learning
Offline Reinforcement Learning with Implicit Q-Learning
Ilya Kostrikov
Ashvin Nair
Sergey Levine
OffRL
214
843
0
12 Oct 2021
Reward (Mis)design for Autonomous Driving
Reward (Mis)design for Autonomous Driving
W. B. Knox
A. Allievi
Holger Banzhaf
Felix Schmitt
Peter Stone
83
113
0
28 Apr 2021
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on
  Open Problems
Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems
Sergey Levine
Aviral Kumar
George Tucker
Justin Fu
OffRL
GP
340
1,960
0
04 May 2020
1