Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2504.00424
Cited By
Hawkeye:Efficient Reasoning with Model Collaboration
1 April 2025
Jianshu She
Z. Li
Zhemin Huang
Qi Li
Peiran Xu
Haonan Li
Qirong Ho
LRM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"Hawkeye:Efficient Reasoning with Model Collaboration"
16 / 16 papers shown
Title
HAPO: Training Language Models to Reason Concisely via History-Aware Policy Optimization
Chengyu Huang
Zhengxin Zhang
Claire Cardie
LRM
118
0
0
16 May 2025
A Survey on Collaborative Mechanisms Between Large and Small Language Models
Yi Chen
JiaHao Zhao
HaoHao Han
86
1
0
12 May 2025
S1-Bench: A Simple Benchmark for Evaluating System 1 Thinking Capability of Large Reasoning Models
Wenyuan Zhang
Jiawei Sheng
Xinghua Zhang
Zefeng Zhang
Tingwen Liu
ELM
LRM
108
5
0
14 Apr 2025
Chain of Draft: Thinking Faster by Writing Less
Silei Xu
Wenhao Xie
Lingxiao Zhao
Pengcheng He
AI4TS
LRM
170
85
0
25 Feb 2025
CoT-Valve: Length-Compressible Chain-of-Thought Tuning
Xinyin Ma
Guangnian Wan
Runpeng Yu
Gongfan Fang
Xinchao Wang
LRM
156
55
0
13 Feb 2025
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
...
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
ReLM
VLM
OffRL
AI4TS
LRM
380
2,000
0
22 Jan 2025
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
Zayne Sprague
Fangcong Yin
Juan Diego Rodriguez
Dongwei Jiang
Manya Wadhwa
Prasann Singhal
Xinyu Zhao
Xi Ye
Kyle Mahowald
Greg Durrett
ReLM
LRM
224
131
0
18 Sep 2024
DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models
Zhihong Shao
Peiyi Wang
Qihao Zhu
Runxin Xu
Jun-Mei Song
...
Haowei Zhang
Mingchuan Zhang
Yiming Li
Yu-Huan Wu
Daya Guo
ReLM
LRM
169
1,288
0
05 Feb 2024
Graph of Thoughts: Solving Elaborate Problems with Large Language Models
Maciej Besta
Nils Blach
Aleš Kubíček
Robert Gerstenberger
Michal Podstawski
...
Joanna Gajda
Tomasz Lehmann
H. Niewiadomski
Piotr Nyczyk
Torsten Hoefler
LRM
AI4CE
LM&Ro
169
707
0
18 Aug 2023
Symbolic Chain-of-Thought Distillation: Small Models Can Also "Think" Step-by-Step
Liunian Harold Li
Jack Hessel
Youngjae Yu
Xiang Ren
Kai-Wei Chang
Yejin Choi
LRM
AI4CE
ReLM
100
143
0
24 Jun 2023
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
Rafael Rafailov
Archit Sharma
E. Mitchell
Stefano Ermon
Christopher D. Manning
Chelsea Finn
ALM
389
4,163
0
29 May 2023
Automatic Chain of Thought Prompting in Large Language Models
Zhuosheng Zhang
Aston Zhang
Mu Li
Alexander J. Smola
ReLM
LRM
153
635
0
07 Oct 2022
ReAct: Synergizing Reasoning and Acting in Language Models
Shunyu Yao
Jeffrey Zhao
Dian Yu
Nan Du
Izhak Shafran
Karthik Narasimhan
Yuan Cao
LLMAG
ReLM
LRM
450
2,982
0
06 Oct 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
853
9,714
0
28 Jan 2022
Training Verifiers to Solve Math Word Problems
K. Cobbe
V. Kosaraju
Mohammad Bavarian
Mark Chen
Heewoo Jun
...
Jerry Tworek
Jacob Hilton
Reiichiro Nakano
Christopher Hesse
John Schulman
ReLM
OffRL
LRM
359
4,598
0
27 Oct 2021
Proximal Policy Optimization Algorithms
John Schulman
Filip Wolski
Prafulla Dhariwal
Alec Radford
Oleg Klimov
OffRL
571
19,296
0
20 Jul 2017
1