ResearchTrend.AI

Offline Reinforcement Learning from Human Feedback in Real-World Sequence-to-Sequence Tasks

4 November 2020
Julia Kreutzer
Stefan Riezler
Carolin (Haas) Lawrence
    RALM
    OffRL

Papers citing "Offline Reinforcement Learning from Human Feedback in Real-World Sequence-to-Sequence Tasks"

5 / 5 papers shown
Reinforcement Learning for Generative AI: A Survey
Yuanjiang Cao
Quan.Z Sheng
Julian McAuley
Lina Yao
SyDa
95
11
0
28 Aug 2023
Fine-Tuning Language Models from Human Preferences
Daniel M. Ziegler
Nisan Stiennon
Jeff Wu
Tom B. Brown
Alec Radford
Dario Amodei
Paul Christiano
G. Irving
ALM
422
1,664
0
18 Sep 2019
On the Weaknesses of Reinforcement Learning for Neural Machine Translation
Leshem Choshen
Lior Fox
Zohar Aizenbud
Omri Abend
64
106
0
03 Jul 2019
Challenges of Real-World Reinforcement Learning
Gabriel Dulac-Arnold
D. Mankowitz
Todd Hester
OffRL
69
545
0
29 Apr 2019
Data-Efficient Off-Policy Policy Evaluation for Reinforcement Learning
Philip S. Thomas
Emma Brunskill
OffRL
225
573
0
04 Apr 2016