2411.13082
Patience Is The Key to Large Language Model Reasoning
20 November 2024
Yijiong Yu
LRM
Papers citing "Patience Is The Key to Large Language Model Reasoning"
5 / 5 papers shown
Title
O1 Replication Journey: A Strategic Progress Report -- Part 1
Yiwei Qin, Xuefeng Li, Haoyang Zou, Yixiu Liu, Shijie Xia, ..., Yixin Ye, Weizhe Yuan, Hector Liu, Yuezun Li, Pengfei Liu
VLM
89; 91; 0
08 Oct 2024
Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters
Charlie Snell, Jaehoon Lee, Kelvin Xu, Aviral Kumar
LRM
198; 698; 0
06 Aug 2024
Step-DPO: Step-wise Preference Optimization for Long-chain Reasoning of LLMs
Xin Lai, Zhuotao Tian, Yukang Chen, Senqiao Yang, Xiangru Peng, Jiaya Jia
LRM
153; 126; 0
26 Jun 2024
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov, Archit Sharma, E. Mitchell, Stefano Ermon, Christopher D. Manning, Chelsea Finn
ALM
389; 4,163; 0
29 May 2023
Training Verifiers to Solve Math Word Problems
K. Cobbe, V. Kosaraju, Mohammad Bavarian, Mark Chen, Heewoo Jun, ..., Jerry Tworek, Jacob Hilton, Reiichiro Nakano, Christopher Hesse, John Schulman
ReLM, OffRL, LRM
353; 4,598; 0
27 Oct 2021