Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2501.04519
Cited By
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
8 January 2025
Xinyu Guan
Lefei Zhang
Yifei Liu
Ning Shang
Youran Sun
Yi Zhu
Fan Yang
Mao Yang
LRM
SyDa
ReLM
Re-assign community
ArXiv
PDF
HTML
Papers citing
"rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking"
13 / 63 papers shown
Title
Dyve: Thinking Fast and Slow for Dynamic Process Verification
Qiang Xu
Feiyu Xiong
Zhijian Xu
Xiangyu Wen
Qiang Xu
LRM
40
3
0
16 Feb 2025
CoT-Valve: Length-Compressible Chain-of-Thought Tuning
Xinyin Ma
Guangnian Wan
Runpeng Yu
Gongfan Fang
Xinchao Wang
LRM
86
27
0
13 Feb 2025
Typhoon T1: An Open Thai Reasoning Model
Pittawat Taveekitworachai
Potsawee Manakul
Kasima Tharnpipitchai
Kunat Pipatanakul
OffRL
LRM
102
0
0
13 Feb 2025
ReasonFlux: Hierarchical LLM Reasoning via Scaling Thought Templates
L. Yang
Zhaochen Yu
Tengjiao Wang
Mengdi Wang
ReLM
LRM
AI4CE
101
12
0
10 Feb 2025
PIPA: Preference Alignment as Prior-Informed Statistical Estimation
Junbo Li
Zhangyang Wang
Qiang Liu
OffRL
106
0
0
09 Feb 2025
Iterative Deepening Sampling for Large Language Models
Weizhe Chen
Sven Koenig
B. Dilkina
LRM
ReLM
88
1
0
08 Feb 2025
Leveraging Reasoning with Guidelines to Elicit and Utilize Knowledge for Enhancing Safety Alignment
Haoyu Wang
Zeyu Qin
Li Shen
Xueqian Wang
Minhao Cheng
Dacheng Tao
99
2
0
06 Feb 2025
Brief analysis of DeepSeek R1 and its implications for Generative AI
Sarah Mercer
Samuel Spillard
Daniel P. Martin
79
13
0
04 Feb 2025
Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search
Maohao Shen
Guangtao Zeng
Zhenting Qi
Zhang-Wei Hong
Zhenfang Chen
Wei Lu
G. Wornell
Subhro Das
David D. Cox
Chuang Gan
LLMAG
LRM
237
7
0
04 Feb 2025
A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods
Isha Puri
Shivchander Sudalairaj
Guangxuan Xu
Kai Xu
Akash Srivastava
LRM
78
4
0
03 Feb 2025
Breaking Focus: Contextual Distraction Curse in Large Language Models
Yue Huang
Yanbo Wang
Zixiang Xu
Chujie Gao
Siyuan Wu
Jiayi Ye
Preslav Nakov
Pin-Yu Chen
Jiahui Geng
AAML
48
2
0
03 Feb 2025
Spend Wisely: Maximizing Post-Training Gains in Iterative Synthetic Data Boostrapping
Pu Yang
Yunzhen Feng
Ziyuan Chen
Yuhang Wu
Zhuoyuan Li
DiffM
106
0
0
31 Jan 2025
Chain-of-Retrieval Augmented Generation
Liang Wang
Haonan Chen
Nan Yang
Xiaolong Huang
Zhicheng Dou
Furu Wei
RALM
LRM
ReLM
3DV
88
6
0
24 Jan 2025
Previous
1
2