arXiv:2011.02511
Offline Reinforcement Learning from Human Feedback in Real-World Sequence-to-Sequence Tasks
4 November 2020
Julia Kreutzer
Stefan Riezler
Carolin (Haas) Lawrence
RALM
OffRL
Papers citing
"Offline Reinforcement Learning from Human Feedback in Real-World Sequence-to-Sequence Tasks"
Reinforcement Learning for Generative AI: A Survey
Yuanjiang Cao
Quan Z. Sheng
Julian McAuley
Lina Yao
SyDa
95
11
0
28 Aug 2023
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
422
1,664
0
18 Sep 2019
On the Weaknesses of Reinforcement Learning for Neural Machine Translation
Leshem Choshen
Lior Fox
Zohar Aizenbud
Omri Abend
64
106
0
03 Jul 2019
Challenges of Real-World Reinforcement Learning
Gabriel Dulac-Arnold
D. Mankowitz
Todd Hester
OffRL
69
545
0
29 Apr 2019
Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning
Philip S. Thomas
Emma Brunskill
OffRL
225
573
0
04 Apr 2016