2502.13173
Cited By
Thinking Preference Optimization
17 February 2025
Wang Yang
Hongye Jin
Jingfeng Yang
Vipin Chaudhary
Xiaotian Han
ReLM
LRM
Papers citing
"Thinking Preference Optimization"
3 / 3 papers shown
Title
AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models
Feng Luo, Yu-Neng Chuang, Guanchu Wang, Hoang Anh Duy Le, Shaochen Zhong, ..., Jiayi Yuan, Yang Sui, Vladimir Braverman, Vipin Chaudhary, Xia Hu
LRM
41  1  0
28 May 2025

Learning to Reason under Off-Policy Guidance
Jianhao Yan, Yafu Li, Zican Hu, Zhi Wang, Ganqu Cui, Xiaoye Qu, Yu Cheng, Yue Zhang
OffRL, LRM
61  8  0
21 Apr 2025

Enhancing LLM Reasoning with Iterative DPO: A Comprehensive Empirical Investigation
Songjun Tu, Jiahao Lin, Xiangyu Tian, Qichao Zhang, Linjing Li, ..., Nan Xu, Wei He, Xiangyuan Lan, D. Jiang, Dongbin Zhao
LRM
98  4  0
17 Mar 2025