2404.00530
Cited By
Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization
31 March 2024
Hritik Bansal, Ashima Suvarna, Gantavya Bhatt, Nanyun Peng, Kai-Wei Chang, Aditya Grover
ALM
Papers citing "Comparing Bad Apples to Good Oranges: Aligning Large Language Models via Joint Preference Optimization"
2 / 52 papers shown
Title
Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer
Colin Raffel, Noam M. Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu
AIMat
305
19,824
0
23 Oct 2019
Proximal Policy Optimization Algorithms
John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, Oleg Klimov
OffRL
245
18,685
0
20 Jul 2017