Optimal Policy Minimum Bayesian Risk

22 May 2025

Papers citing "Optimal Policy Minimum Bayesian Risk"

2 / 2 papers shown

Title
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs Kanishk Gandhi Ayush Chakravarthy Anikait Singh Nathan Lile Noah D. Goodman ReLM LRM 212 111 0 03 Mar 2025
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning DeepSeek-AI Daya Guo Dejian Yang Haowei Zhang Junxiao Song ... Shiyu Wang S. Yu Shunfeng Zhou Shuting Pan S.S. Li ReLM VLM OffRL AI4TS LRM 398 2,034 0 22 Jan 2025