Enhancing LLM Safety via Constrained Direct Preference Optimization

Papers citing "Enhancing LLM Safety via Constrained Direct Preference Optimization"

Nash Learning from Human Feedback
Rémi Munos, Michal Valko, Daniele Calandriello, M. G. Azar, Mark Rowland, ..., Nikola Momchev, Olivier Bachem, D. Mankowitz, Doina Precup, Bilal Piot
01 Dec 2023