2308.00436
SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning
1 August 2023
Ning Miao
Yee Whye Teh
Tom Rainforth
ReLM
LRM
Papers citing "SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning"
38 / 88 papers shown
Evaluating LLMs at Detecting Errors in LLM Responses
Ryo Kamoi
Sarkar Snigdha Sarathi Das
Renze Lou
Jihyun Janice Ahn
Yilun Zhao
...
Salika Dave
Shaobo Qin
Arman Cohan
Wenpeng Yin
Rui Zhang
44
20
0
04 Apr 2024
m&m's: A Benchmark to Evaluate Tool-Use for multi-step multi-modal Tasks
Zixian Ma
Weikai Huang
Jieyu Zhang
Tanmay Gupta
Ranjay Krishna
55
18
0
17 Mar 2024
Think Twice Before Trusting: Self-Detection for Large Language Models through Comprehensive Answer Reflection
Moxin Li
Wenjie Wang
Fuli Feng
Fengbin Zhu
Qifan Wang
Tat-Seng Chua
HILM
LRM
46
14
0
15 Mar 2024
Look Before You Leap: Problem Elaboration Prompting Improves Mathematical Reasoning in Large Language Models
Haoran Liao
Jidong Tian
Shaohua Hu
Hao He
Yaohui Jin
ReLM
LRM
46
1
0
24 Feb 2024
How Do Humans Write Code? Large Models Do It the Same Way Too
Long Li
Xuzheng He
LRM
43
4
0
24 Feb 2024
RefuteBench: Evaluating Refuting Instruction-Following for Large Language Models
Jianhao Yan
Yun Luo
Yue Zhang
ALM
LRM
38
7
0
21 Feb 2024
Learning to Check: Unleashing Potentials for Self-Correction in Large Language Models
Che Zhang
Zhenyang Xiao
Chengcheng Han
Yixin Lian
Yuejian Fang
LRM
33
0
0
20 Feb 2024
AutoPRM: Automating Procedural Supervision for Multi-Step Reasoning via Controllable Question Decomposition
Zhaorun Chen
Zhuokai Zhao
Zhihong Zhu
Ruiqi Zhang
Xiang Li
Bhiksha Raj
Huaxiu Yao
LRM
30
25
0
18 Feb 2024
Are Machines Better at Complex Reasoning? Unveiling Human-Machine Inference Gaps in Entailment Verification
Soumya Sanyal
Tianyi Xiao
Jiacheng Liu
Wenya Wang
Xiang Ren
LRM
ReLM
49
12
0
06 Feb 2024
Professional Agents -- Evolving Large Language Models into Autonomous Experts with Human-Level Competencies
Zhixuan Chu
Yan Wang
Feng Zhu
Lu Yu
Longfei Li
Jinjie Gu
LLMAG
23
8
0
06 Feb 2024
Deductive Beam Search: Decoding Deducible Rationale for Chain-of-Thought Reasoning
Tinghui Zhu
Kai Zhang
Jian Xie
Yu-Chuan Su
LRM
28
15
0
31 Jan 2024
Towards Goal-oriented Prompt Engineering for Large Language Models: A Survey
Haochen Li
Jonathan Leung
Zhiqi Shen
LM&MA
LLMAG
LRM
23
0
0
25 Jan 2024
Investigate-Consolidate-Exploit: A General Strategy for Inter-Task Agent Self-Evolution
Cheng Qian
Shihao Liang
Yujia Qin
Yining Ye
Xin Cong
Yankai Lin
Yesai Wu
Zhiyuan Liu
Maosong Sun
LLMAG
24
12
0
25 Jan 2024
ReFT: Reasoning with Reinforced Fine-Tuning
Trung Quoc Luong
Xinbo Zhang
Zhanming Jie
Peng Sun
Xiaoran Jin
Hang Li
OffRL
LRM
ReLM
40
87
0
17 Jan 2024
DCR: Divide-and-Conquer Reasoning for Multi-choice Question Answering with LLMs
Zijie Meng
Yan Zhang
Zhaopeng Feng
Zuozhu Liu
LRM
27
4
0
10 Jan 2024
Modeling Complex Mathematical Reasoning via Large Language Model based MathAgent
Haoran Liao
Qinyi Du
Shaohua Hu
Hao He
Yanyan Xu
Jidong Tian
Yaohui Jin
LRM
AI4CE
32
1
0
14 Dec 2023
Conceptual Engineering Using Large Language Models
Bradley Paul Allen
37
0
0
01 Dec 2023
Uncertainty-aware Language Modeling for Selective Question Answering
Qi Yang
Shreya Ravikumar
F. Schmitt-Ulms
S. Lolla
Ege Demir
...
Sadhana Lolla
Elaheh Ahmadi
Daniela Rus
Alexander Amini
Alejandro Perez
18
7
0
26 Nov 2023
Robot Learning in the Era of Foundation Models: A Survey
Xuan Xiao
Jiahang Liu
Zhipeng Wang
Yanmin Zhou
Yong Qi
Qian Cheng
Bin He
Shuo Jiang
AI4CE
LM&Ro
29
27
0
24 Nov 2023
Mind's Mirror: Distilling Self-Evaluation Capability and Comprehensive Thinking from Large Language Models
Weize Liu
Guocong Li
Kai Zhang
Bang Du
Qiyuan Chen
Xuming Hu
Hongxia Xu
Jintai Chen
Jian Wu
LRM
18
6
0
15 Nov 2023
Towards A Unified View of Answer Calibration for Multi-Step Reasoning
Shumin Deng
Ningyu Zhang
Nay Oo
Bryan Hooi
LRM
48
2
0
15 Nov 2023
StrategyLLM: Large Language Models as Strategy Generators, Executors, Optimizers, and Evaluators for Problem Solving
Chang Gao
Haiyun Jiang
Deng Cai
Shuming Shi
Wai Lam
LRM
34
3
0
15 Nov 2023
LLMs cannot find reasoning errors, but can correct them given the error location
Gladys Tyen
Hassan Mansoor
Victor Carbune
Peter Chen
Tony Mak
LRM
19
73
0
14 Nov 2023
A Closer Look at the Self-Verification Abilities of Large Language Models in Logical Reasoning
Ruixin Hong
Hongming Zhang
Xinyu Pang
Dong Yu
Changshui Zhang
LRM
52
23
0
14 Nov 2023
VerityMath: Advancing Mathematical Reasoning by Self-Verification Through Unit Consistency
Vernon Toh
Ratish Puduppully
Nancy F. Chen
LRM
30
5
0
13 Nov 2023
An LLM can Fool Itself: A Prompt-Based Adversarial Attack
Xilie Xu
Keyi Kong
Ning Liu
Li-zhen Cui
Di Wang
Jingfeng Zhang
Mohan Kankanhalli
AAML
SILM
33
68
0
20 Oct 2023
DiagrammerGPT: Generating Open-Domain, Open-Platform Diagrams via LLM Planning
Abhaysinh Zala
Han Lin
Jaemin Cho
Mohit Bansal
40
12
0
18 Oct 2023
AgentCF: Collaborative Learning with Autonomous Language Agents for Recommender Systems
Junjie Zhang
Yupeng Hou
Ruobing Xie
Wenqi Sun
Julian McAuley
Wayne Xin Zhao
Leyu Lin
Ji-Rong Wen
LLMAG
22
67
0
13 Oct 2023
Navigate through Enigmatic Labyrinth A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future
Zheng Chu
Jingchang Chen
Qianglong Chen
Weijiang Yu
Tao He
Haotian Wang
Weihua Peng
Ming-Yu Liu
Bing Qin
Ting Liu
LRM
AI4CE
37
153
0
27 Sep 2023
Chain-of-Verification Reduces Hallucination in Large Language Models
S. Dhuliawala
M. Komeili
Jing Xu
Roberta Raileanu
Xian Li
Asli Celikyilmaz
Jason Weston
LRM
HILM
22
177
0
20 Sep 2023
EchoPrompt: Instructing the Model to Rephrase Queries for Improved In-context Learning
Rajasekhar Reddy Mekala
Yasaman Razeghi
Sameer Singh
LRM
33
10
0
16 Sep 2023
A Survey on Large Language Model based Autonomous Agents
Lei Wang
Chengbang Ma
Xueyang Feng
Zeyu Zhang
Hao-ran Yang
...
Xu Chen
Yankai Lin
Wayne Xin Zhao
Zhewei Wei
Ji-Rong Wen
LLMAG
AI4CE
LM&Ro
41
1,126
0
22 Aug 2023
ZYN: Zero-Shot Reward Models with Yes-No Questions for RLAIF
Víctor Gallego
SyDa
51
4
0
11 Aug 2023
Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies
Liangming Pan
Michael Stephen Saxon
Wenda Xu
Deepak Nathani
Xinyi Wang
William Yang Wang
KELM
LRM
47
201
0
06 Aug 2023
Enhancing Large Language Models Against Inductive Instructions with Dual-critique Prompting
Rui Wang
Hongru Wang
Fei Mi
Yi Chen
Boyang Xue
Kam-Fai Wong
Rui-Lan Xu
31
13
0
23 May 2023
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck
Varun Chandrasekaran
Ronen Eldan
J. Gehrke
Eric Horvitz
...
Scott M. Lundberg
Harsha Nori
Hamid Palangi
Marco Tulio Ribeiro
Yi Zhang
ELM
AI4MH
AI4CE
ALM
322
3,021
0
22 Mar 2023
Self-Consistency Improves Chain of Thought Reasoning in Language Models
Xuezhi Wang
Jason W. Wei
Dale Schuurmans
Quoc Le
Ed H. Chi
Sharan Narang
Aakanksha Chowdhery
Denny Zhou
ReLM
BDL
LRM
AI4CE
314
3,273
0
21 Mar 2022
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
Jason W. Wei
Xuezhi Wang
Dale Schuurmans
Maarten Bosma
Brian Ichter
F. Xia
Ed H. Chi
Quoc Le
Denny Zhou
LM&Ro
LRM
AI4CE
ReLM
398
8,559
0
28 Jan 2022