2502.17387
Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models
24 February 2025
Alon Albalak
Duy Phung
Nathan Lile
Rafael Rafailov
Kanishk Gandhi
Louis Castricato
Anikait Singh
Chase Blagden
Violet Xiang
Dakota Mahan
Nick Haber
OffRL
LRM
Papers citing
"Big-Math: A Large-Scale, High-Quality Math Dataset for Reinforcement Learning in Language Models"
3 / 3 papers shown
Exploring the Potential of Offline RL for Reasoning in LLMs: A Preliminary Study
Xiaoyu Tian
Sitong Zhao
Haotian Wang
Shuaiting Chen
Yiping Peng
Yunjie Ji
Han Zhao
Xiangang Li
OffRL
LRM
34
0
0
04 May 2025
DeepDistill: Enhancing LLM Reasoning Capabilities via Large-Scale Difficulty-Graded Data Training
Xiaoyu Tian
Sitong Zhao
Haotian Wang
Shuaiting Chen
Yiping Peng
Yunjie Ji
Han Zhao
Xiangang Li
LRM
57
1
0
24 Apr 2025
NaturalReasoning: Reasoning in the Wild with 2.8M Challenging Questions
Weizhe Yuan
Jane Dwivedi-Yu
Song Jiang
Karthik Padthe
Yang Li
...
Ilia Kulikov
Kyunghyun Cho
Yuandong Tian
Jason Weston
Xian Li
ReLM
LRM
62
10
0
24 Feb 2025