Training Small Reasoning LLMs with Cognitive Preference Alignment

Papers citing "Training Small Reasoning LLMs with Cognitive Preference Alignment"

KTO: Model Alignment as Prospect Theoretic Optimization
Kawin Ethayarajh, Winnie Xu, Niklas Muennighoff, Dan Jurafsky, Douwe Kiela
02 Feb 2024