2212.10114
Cited By
True Detective: A Deep Abductive Reasoning Benchmark Undoable for GPT-3 and Challenging for GPT-4
20 December 2022
Maksym Del
Mark Fishel
RALM
ELM
ReLM
LRM
Papers citing "True Detective: A Deep Abductive Reasoning Benchmark Undoable for GPT-3 and Challenging for GPT-4"
7 / 7 papers shown
Title
Evaluating the Logical Reasoning Abilities of Large Reasoning Models
Hanmeng Liu
Yiran Ding
Zhizhang Fu
Chaoli Zhang
Xiaozhang Liu
Yue Zhang
ELM
LRM
7
0
0
17 May 2025
Toward Generalizable Evaluation in the LLM Era: A Survey Beyond Benchmarks
Yixin Cao
Shibo Hong
Xinfeng Li
Jiahao Ying
Yubo Ma
...
Juanzi Li
Aixin Sun
Xuanjing Huang
Tat-Seng Chua
Tianwei Zhang
ALM
ELM
96
2
0
26 Apr 2025
LogiDynamics: Unraveling the Dynamics of Logical Inference in Large Language Model Reasoning
Tianshi Zheng
Jiayang Cheng
Chunyang Li
Haochen Shi
Zhendong Wang
Jiaxin Bai
Yangqiu Song
Ginny Wong
Simon See
LRM
46
3
0
16 Feb 2025
Puzzle Solving using Reasoning of Large Language Models: A Survey
Panagiotis Giadikiaroglou
Maria Lymperaiou
Giorgos Filandrianos
Giorgos Stamou
ELM
ReLM
LRM
19
27
0
17 Feb 2024
Large Language Models are Zero-Shot Reasoners
Takeshi Kojima
S. Gu
Machel Reid
Yutaka Matsuo
Yusuke Iwasawa
ReLM
LRM
328
4,077
0
24 May 2022
Training language models to follow instructions with human feedback
Long Ouyang
Jeff Wu
Xu Jiang
Diogo Almeida
Carroll L. Wainwright
...
Amanda Askell
Peter Welinder
Paul Christiano
Jan Leike
Ryan J. Lowe
OSLM
ALM
372
12,081
0
04 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
447
8,650
0
28 Jan 2022