Fine-Tuning Language Models from Human Preferences

    ALM

Papers citing "Fine-Tuning Language Models from Human Preferences"

50 / 1,265 papers shown
Title
Intermediate direct preference optimization
Atsushi Kojima
06 Aug 2024