
Enhancing LLM Safety via Constrained Direct Preference Optimization
Papers citing "Enhancing LLM Safety via Constrained Direct Preference Optimization"
Title |
---|
Nash Learning from Human Feedback. Rémi Munos, Michal Valko, Daniele Calandriello, M. G. Azar, Mark Rowland, ..., Nikola Momchev, Olivier Bachem, D. Mankowitz, Doina Precup, Bilal Piot |