The Impact of Preference Agreement in Reinforcement Learning from Human Feedback: A Case Study in Summarization

2 November 2023

Papers citing "The Impact of Preference Agreement in Reinforcement Learning from Human Feedback: A Case Study in Summarization"

Title | Authors | Tags | Counts | Date
Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback | Yuntao Bai, Andy Jones, Kamal Ndousse, Amanda Askell, Anna Chen, ..., Jack Clark, Sam McCandlish, C. Olah, Benjamin Mann, Jared Kaplan | - | 231 2,532 0 | 12 Apr 2022
Training language models to follow instructions with human feedback | Long Ouyang, Jeff Wu, Xu Jiang, Diogo Almeida, Carroll L. Wainwright, ..., Amanda Askell, Peter Welinder, Paul Christiano, Jan Leike, Ryan J. Lowe | OSLM ALM | 739 12,815 0 | 04 Mar 2022
Dealing with Disagreements: Looking Beyond the Majority Vote in Subjective Annotations | Aida Mostafazadeh Davani, Mark Díaz, Vinodkumar Prabhakaran | - | 39 314 0 | 12 Oct 2021
Learning to summarize from human feedback | Nisan Stiennon, Long Ouyang, Jeff Wu, Daniel M. Ziegler, Ryan J. Lowe, Chelsea Voss, Alec Radford, Dario Amodei, Paul Christiano | ALM | 198 2,120 0 | 02 Sep 2020
SummEval: Re-evaluating Summarization Evaluation | Alexander R. Fabbri, Wojciech Kryściński, Bryan McCann, Caiming Xiong, R. Socher, Dragomir R. Radev | HILM | 90 709 0 | 24 Jul 2020
Deep reinforcement learning from human preferences | Paul Christiano, Jan Leike, Tom B. Brown, Miljan Martic, Shane Legg, Dario Amodei | - | 112 3,282 0 | 12 Jun 2017
Statistical modality tagging from rule-based annotations and crowdsourcing | Vinodkumar Prabhakaran, Michael Bloodgood, Mona T. Diab, Bonnie J. Dorr, Lori S. Levin, C. Piatko, Owen Rambow, Benjamin Van Durme | - | 32 28 0 | 04 Mar 2015