Token-Supervised Value Models for Enhancing Mathematical Problem-Solving Capabilities of Large Language Models


12 July 2024
Jung Hyun Lee
June Yong Yang
Byeongho Heo
Dongyoon Han
Kang Min Yoo
Eunho Yang
    LRM

Papers citing "Token-Supervised Value Models for Enhancing Mathematical Problem-Solving Capabilities of Large Language Models"

3 / 3 papers shown
DAPO: An Open-Source LLM Reinforcement Learning System at Scale
Qiying Yu
Zhe Zhang
Ruofei Zhu
Yufeng Yuan
Xiaochen Zuo
...
Ya-Qin Zhang
Lin Yan
Mu Qiao
Yonghui Wu
Mingxuan Wang
OffRL
LRM
69
54
0
18 Mar 2025
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
328
4,077
0
24 May 2022
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
320
3,273
0
21 Mar 2022