Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization

31 March 2024

Papers citing "Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization"

2 / 52 papers shown

Title
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam M. Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu
AIMat · 305 · 19,824 · 0 · 23 Oct 2019

Proximal Policy Optimization Algorithms
John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, Oleg Klimov
OffRL · 245 · 18,685 · 0 · 20 Jul 2017