2502.14768
Cited By
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning
21 February 2025
Tian Xie
Zitian Gao
Qingnan Ren
Haoming Luo
Yuqian Hong
Bryan Dai
Joey Zhou
Kai Qiu
Zhirong Wu
Chong Luo
ReLM
OffRL
LRM
Papers citing
"Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning"
40 / 40 papers shown
Title
SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond
Junteng Liu
Yuanxiang Fan
Z. L. Jiang
Han Ding
Yongyi Hu
...
Yunan Huang
Mozhi Zhang
Pengyu Zhao
Junjie Yan
Junxian He
OffRL
NAI
SyDa
LRM
ELM
9
1
0
26 May 2025
Interleaved Reasoning for Large Language Models via Reinforcement Learning
Roy Xie
David Qiu
Deepak Gopinath
Dong Lin
Yanchao Sun
Chong-Jun Wang
Saloni Potdar
Bhuwan Dhingra
KELM
LRM
49
0
0
26 May 2025
Enigmata: Scaling Logical Reasoning in Large Language Models with Synthetic Verifiable Puzzles
Jiangjie Chen
Qianyu He
Siyu Yuan
Aili Chen
Zhicheng Cai
...
Qiying Yu
Xuefeng Li
Jiaze Chen
Hao Zhou
Mingxuan Wang
ReLM
LRM
11
1
0
26 May 2025
Token-Importance Guided Direct Preference Optimization
Yang Ning
Lin Hai
Liu Yibo
Tian Baoliang
Liu Guoqing
Zhang Haijun
34
0
0
26 May 2025
Outcome-based Reinforcement Learning to Predict the Future
Benjamin Turtel
Danny Franklin
Kris Skotheim
Luke Hewitt
Philipp Schoenegger
OffRL
AI4TS
23
0
0
23 May 2025
Towards Revealing the Effectiveness of Small-Scale Fine-tuning in R1-style Reinforcement Learning
Yutong Chen
Jiandong Gao
Ji Wu
ALM
86
0
0
23 May 2025
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Fanqi Wan
Weizhou Shen
Shengyi Liao
Yingcheng Shi
Chenliang Li
Ziyi Yang
Ji Zhang
Fei Huang
Jingren Zhou
Ming Yan
OffRL
LLMAG
ReLM
LRM
30
0
0
23 May 2025
ProgRM: Build Better GUI Agents with Progress Rewards
Danyang Zhang
Situo Zhang
Ziyue Yang
Zichen Zhu
Zihan Zhao
Ruisheng Cao
Lu Chen
Kai Yu
41
0
0
23 May 2025
Divide-Fuse-Conquer: Eliciting "Aha Moments" in Multi-Scenario Games
Xiaoqing Zhang
Huabin Zheng
Ang Lv
Yuhan Liu
Zirui Song
Flood Sung
Xiuying Chen
Rui Yan
OffRL
ReLM
LRM
AI4CE
43
0
0
22 May 2025
Select2Reason: Efficient Instruction-Tuning Data Selection for Long-CoT Reasoning
Cehao Yang
Xueyuan Lin
Chengjin Xu
Xuhui Jiang
Xiaojun Wu
Honghao Liu
Hui Xiong
Jian Guo
LRM
29
0
0
22 May 2025
LLM Access Shield: Domain-Specific LLM Framework for Privacy Policy Compliance
Yu Wang
Cailing Cai
Zhihua Xiao
Peifung E. Lam
38
0
0
22 May 2025
SATURN: SAT-based Reinforcement Learning to Unleash Language Model Reasoning
Huanyu Liu
Jia Li
Hao Zhu
Kechi Zhang
Yihong Dong
Ge Li
OffRL
ReLM
LRM
32
0
0
22 May 2025
R²ec: Towards Large Recommender Models with Reasoning
Runyang You
Chak Tou Leong
Xinyu Lin
Xin Zhang
Wenjie Wang
Wenjie Li
Liqiang Nie
LRM
43
0
0
22 May 2025
Trajectory Bellman Residual Minimization: A Simple Value-Based Method for LLM Reasoning
Yurun Yuan
Fan Chen
Zeyu Jia
Alexander Rakhlin
Tengyang Xie
OffRL
79
1
0
21 May 2025
An Empirical Study on Reinforcement Learning for Reasoning-Search Interleaved LLM Agents
Bowen Jin
Jinsung Yoon
Priyanka Kargupta
Sercan O. Arik
Jiawei Han
LRM
65
1
0
21 May 2025
Training Step-Level Reasoning Verifiers with Formal Verification Tools
Ryo Kamoi
Yusen Zhang
Nan Zhang
Sarkar Snigdha Sarathi Das
Rui Zhang
OffRL
LRM
15
0
0
21 May 2025
ToTRL: Unlock LLM Tree-of-Thoughts Reasoning Potential through Puzzles Solving
Haoyuan Wu
Xueyi Chen
Rui Ming
Jilong Gao
Shoubo Hu
Zhuolun He
Bei Yu
LRM
55
0
0
19 May 2025
Observe-R1: Unlocking Reasoning Abilities of MLLMs with Dynamic Progressive Reinforcement Learning
Zirun Guo
Minjie Hong
Tao Jin
OffRL
LRM
69
0
0
18 May 2025
VeriReason: Reinforcement Learning with Testbench Feedback for Reasoning-Enhanced Verilog Generation
Yiting Wang
Guoheng Sun
Wanghao Ye
Gang Qu
Ang Li
OffRL
3DV
LRM
VLM
56
0
0
17 May 2025
Time-R1: Towards Comprehensive Temporal Reasoning in LLMs
Zijia Liu
Peixuan Han
Haofei Yu
Haoru Li
Jiaxuan You
AI4TS
LRM
49
0
0
16 May 2025
Beyond 'Aha!': Toward Systematic Meta-Abilities Alignment in Large Reasoning Models
Zhiyuan Hu
Yansen Wang
Hanze Dong
Yuhui Xu
Amrita Saha
Caiming Xiong
Bryan Hooi
Junnan Li
LRM
46
2
0
15 May 2025
R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning
Yi-Fan Zhang
Xingyu Lu
X. Hu
Chaoyou Fu
Bin Wen
...
Jianfei Chen
Fan Yang
Zheng Zhang
Yan Li
Liang Wang
OffRL
LRM
66
4
0
05 May 2025
Between Underthinking and Overthinking: An Empirical Study of Reasoning Length and correctness in LLMs
Jinyan Su
Jennifer Healey
Preslav Nakov
Claire Cardie
LRM
227
9
0
30 Apr 2025
WebThinker: Empowering Large Reasoning Models with Deep Research Capability
Xiaochen Li
Jiajie Jin
Guanting Dong
Hongjin Qian
Yutao Zhu
Yongkang Wu
Ji-Rong Wen
Zhicheng Dou
LLMAG
LRM
129
8
0
30 Apr 2025
SARI: Structured Audio Reasoning via Curriculum-Guided Reinforcement Learning
Cheng Wen
Tingwei Guo
Shuaijiang Zhao
Wei Zou
Xiangang Li
OffRL
AuLLM
LRM
75
5
0
22 Apr 2025
Think Deep, Think Fast: Investigating Efficiency of Verifier-free Inference-time-scaling Methods
Junlin Wang
Shang Zhu
Jon Saad-Falcon
Ben Athiwaratkun
Qingyang Wu
Jue Wang
Shuaiwen Leon Song
Ce Zhang
Bhuwan Dhingra
James Y. Zou
LRM
70
8
0
18 Apr 2025
Nemotron-CrossThink: Scaling Self-Learning beyond Math Reasoning
Syeda Nahida Akter
Shrimai Prabhumoye
Matvei Novikov
Seungju Han
Ying Lin
...
Eric Nyberg
Yejin Choi
M. Patwary
Mohammad Shoeybi
Bryan Catanzaro
ReLM
OffRL
LRM
355
2
1
15 Apr 2025
DUMP: Automated Distribution-Level Curriculum Learning for RL-based LLM Post-training
Zhenting Wang
Guofeng Cui
Kun Wan
Wentian Zhao
40
1
0
13 Apr 2025
A Sober Look at Progress in Language Model Reasoning: Pitfalls and Paths to Reproducibility
Andreas Hochlehnert
Hardik Bhatnagar
Vishaal Udandarao
Samuel Albanie
Ameya Prabhu
Matthias Bethge
ReLM
ALM
LRM
131
10
0
09 Apr 2025
Concise Reasoning via Reinforcement Learning
Mehdi Fatemi
Banafsheh Rafiee
Mingjie Tang
Kartik Talamadupula
ReLM
OffRL
LRM
73
11
0
07 Apr 2025
Trust Region Preference Approximation: A simple and stable reinforcement learning algorithm for LLM reasoning
Xuerui Su
Shufang Xie
Guoqing Liu
Yingce Xia
Renqian Luo
Peiran Jin
Zhiming Ma
Yue Wang
Zun Wang
Yuting Liu
LRM
48
3
0
06 Apr 2025
Do Theory of Mind Benchmarks Need Explicit Human-like Reasoning in Language Models?
Yi-Long Lu
Chunhui Zhang
Jiajun Song
Lifeng Fan
Wei Wang
OffRL
74
0
0
02 Apr 2025
Learning to Reason for Long-Form Story Generation
Alexander Gurung
Mirella Lapata
ReLM
OffRL
LRM
72
3
0
28 Mar 2025
Video-R1: Reinforcing Video Reasoning in MLLMs
Kaituo Feng
Kaixiong Gong
Yangqiu Song
Zonghao Guo
Yibing Wang
Tianshuo Peng
Jian Wu
Xiaoying Zhang
Benyou Wang
Xiangyu Yue
AI4TS
SyDa
LRM
73
31
0
27 Mar 2025
One Framework to Rule Them All: Unifying RL-Based and RL-Free Methods in RLHF
Xin Cai
57
1
0
25 Mar 2025
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild
Weihao Zeng
Yuzhen Huang
Qian Liu
Wei Liu
Keqing He
Zejun Ma
Junxian He
OffRL
ReLM
LRM
109
85
0
24 Mar 2025
Enhancing LLM Reasoning with Iterative DPO: A Comprehensive Empirical Investigation
Songjun Tu
Jiahao Lin
Xiangyu Tian
Qichao Zhang
Linjing Li
...
Nan Xu
Wei He
Xiangyuan Lan
D. Jiang
Dongbin Zhao
LRM
94
4
0
17 Mar 2025
RAG-RL: Advancing Retrieval-Augmented Generation via RL and Curriculum Learning
Jerry Huang
Siddarth Madala
Risham Sidhu
Cheng Niu
Hao Peng
Julia Hockenmaier
Tong Zhang
LRM
RALM
125
4
0
17 Mar 2025
ReMA: Learning to Meta-think for LLMs with Multi-Agent Reinforcement Learning
Bo Liu
Yunxiang Li
Yangqiu Song
Hanjing Wang
Linyi Yang
...
Jun Wang
Jun Wang
Weinan Zhang
Shuyue Hu
Ying Wen
LLMAG
KELM
LRM
AI4CE
113
10
0
12 Mar 2025
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Bowen Jin
Hansi Zeng
Zhenrui Yue
Dong Wang
Sercan O. Arik
Dong Wang
Hamed Zamani
Jiawei Han
RALM
ReLM
KELM
OffRL
AI4TS
LRM
120
77
0
12 Mar 2025