2407.12863
Cited By
Token-Supervised Value Models for Enhancing Mathematical Problem-Solving Capabilities of Large Language Models
12 July 2024
Jung Hyun Lee
June Yong Yang
Byeongho Heo
Dongyoon Han
Kang Min Yoo
Eunho Yang
LRM
Papers citing "Token-Supervised Value Models for Enhancing Mathematical Problem-Solving Capabilities of Large Language Models"
DAPO: An Open-Source LLM Reinforcement Learning System at Scale
Qiying Yu
Zhe Zhang
Ruofei Zhu
Yufeng Yuan
Xiaochen Zuo
...
Ya-Qin Zhang
Lin Yan
Mu Qiao
Yonghui Wu
Mingxuan Wang
OffRL
LRM
69
54
0
18 Mar 2025
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
328
4,077
0
24 May 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
320
3,273
0
21 Mar 2022