
Training language models to follow instructions with human feedback
Papers citing "Training language models to follow instructions with human feedback"
50 / 6,381 papers shown
Title |
---|
Controllable Preference Optimization: Toward Controllable Multi-Objective Alignment. Yiju Guo, Ganqu Cui, Lifan Yuan, Ning Ding, Jiexin Wang, ... Ruobing Xie, Jie Zhou, Yankai Lin, Zhiyuan Liu, Maosong Sun |