Papers
Communities
Events
Blog
Pricing
Search
Open menu
Home
Papers
2505.23575
Cited By
CoT Red-Handed: Stress Testing Chain-of-Thought Monitoring
29 May 2025
Benjamin Arnav
Pablo Bernabeu-Pérez
Nathan Helm-Burger
Tim Kostolansky
Hannes Whittingham
Mary Phuong
LRM
Re-assign community
ArXiv (abs)
PDF
HTML
Papers citing
"CoT Red-Handed: Stress Testing Chain-of-Thought Monitoring"
14 / 14 papers shown
Title
Reasoning Models Don't Always Say What They Think
Yanda Chen
Joe Benton
Ansh Radhakrishnan
Jonathan Uesato
Carson E. Denison
...
Vlad Mikulik
Samuel R. Bowman
Jan Leike
Jared Kaplan
E. Perez
ReLM
LRM
142
49
1
08 May 2025
OpenCodeReasoning: Advancing Data Distillation for Competitive Coding
Wasi Uddin Ahmad
Mehrzad Samadi
Somshubra Majumdar
Aleksander Ficek
Siddhartha Jain
Jocelyn Huang
Vahid Noroozi
Boris Ginsburg
LRM
128
13
0
02 Apr 2025
Measuring AI Ability to Complete Long Tasks
Thomas Kwa
Ben West
Joel Becker
Amy Deng
Katharyn Garcia
...
Lucas Jun Koba Sato
H. Wijk
Daniel M. Ziegler
Elizabeth Barnes
Lawrence Chan
ELM
248
18
0
18 Mar 2025
Monitoring Reasoning Models for Misbehavior and the Risks of Promoting Obfuscation
Bowen Baker
Joost Huizinga
Leo Gao
Zehao Dou
M. Guan
Aleksander Mądry
Wojciech Zaremba
J. Pachocki
David Farhi
LRM
167
38
0
14 Mar 2025
Chain-of-Thought Reasoning In The Wild Is Not Always Faithful
Iván Arcuschin
Jett Janiak
Robert Krzyzanowski
Senthooran Rajamanoharan
Neel Nanda
Arthur Conmy
ReLM
LRM
155
20
0
11 Mar 2025
A sketch of an AI control safety case
Tomek Korbak
Joshua Clymer
Benjamin Hilton
Buck Shlegeris
Geoffrey Irving
147
10
0
28 Jan 2025
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
DeepSeek-AI
Daya Guo
Dejian Yang
Haowei Zhang
Junxiao Song
...
Shiyu Wang
S. Yu
Shunfeng Zhou
Shuting Pan
S.S. Li
ReLM
VLM
OffRL
AI4TS
LRM
380
2,000
0
22 Jan 2025
Scheherazade: Evaluating Chain-of-Thought Math Reasoning in LLMs with Chain-of-Problems
Stephen Miner
Yoshiki Takashima
Simeng Han
Ferhat Erata
Timos Antonopoulos
R. Piskac
Scott J. Shapiro
LRM
142
4
0
30 Sep 2024
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
Terry Yue Zhuo
Minh Chien Vu
Jenny Chim
Han Hu
Wenhao Yu
...
David Lo
Daniel Fried
Xiaoning Du
H. D. Vries
Leandro von Werra
174
192
0
22 Jun 2024
Faithful Logical Reasoning via Symbolic Chain-of-Thought
Jundong Xu
Hao Fei
Liangming Pan
Qian Liu
Mong Li Lee
Wynne Hsu
OffRL
LRM
LLMAG
140
65
0
28 May 2024
Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning
Debjit Paul
Robert West
Antoine Bosselut
Boi Faltings
ReLM
LRM
115
28
0
21 Feb 2024
Measuring Faithfulness in Chain-of-Thought Reasoning
Tamera Lanham
Anna Chen
Ansh Radhakrishnan
Benoit Steiner
Carson E. Denison
...
Zac Hatfield-Dodds
Jared Kaplan
J. Brauner
Sam Bowman
Ethan Perez
ReLM
LRM
72
193
0
17 Jul 2023
Faithful Chain-of-Thought Reasoning
Qing Lyu
Shreya Havaldar
Adam Stein
Li Zhang
D. Rao
Eric Wong
Marianna Apidianaki
Chris Callison-Burch
ReLM
LRM
119
228
0
31 Jan 2023
Show Your Work: Scratchpads for Intermediate Computation with Language Models
Maxwell Nye
Anders Andreassen
Guy Gur-Ari
Henryk Michalewski
Jacob Austin
...
Aitor Lewkowycz
Maarten Bosma
D. Luan
Charles Sutton
Augustus Odena
ReLM
LRM
183
756
0
30 Nov 2021
1