ResearchTrend.AI

arXiv: 2502.13173
Thinking Preference Optimization

17 February 2025
Wang Yang
Hongye Jin
Jingfeng Yang
Vipin Chaudhary
Xiaotian Han
    ReLM
    LRM

Papers citing "Thinking Preference Optimization"

AutoL2S: Auto Long-Short Reasoning for Efficient Large Language Models
Feng Luo
Yu-Neng Chuang
Guanchu Wang
Hoang Anh Duy Le
Shaochen Zhong
...
Jiayi Yuan
Yang Sui
Vladimir Braverman
Vipin Chaudhary
Xia Hu
LRM
41
1
0
28 May 2025
Learning to Reason under Off-Policy Guidance
Jianhao Yan
Yafu Li
Zican Hu
Zhi Wang
Ganqu Cui
Xiaoye Qu
Yu Cheng
Yue Zhang
OffRL
LRM
61
8
0
21 Apr 2025
Enhancing LLM Reasoning with Iterative DPO: A Comprehensive Empirical Investigation
Songjun Tu
Jiahao Lin
Xiangyu Tian
Qichao Zhang
Linjing Li
...
Nan Xu
Wei He
Xiangyuan Lan
D. Jiang
Dongbin Zhao
LRM
98
4
0
17 Mar 2025