
Faster WIND: Accelerating Iterative Best-of-$N$ Distillation for LLM Alignment
Papers citing "Faster WIND: Accelerating Iterative Best-of-$N$ Distillation for LLM Alignment"
Title |
---|
Nash Learning from Human Feedback. Rémi Munos, Michal Valko, Daniele Calandriello, M. G. Azar, Mark Rowland, ... Nikola Momchev, Olivier Bachem, D. Mankowitz, Doina Precup, Bilal Piot |