2308.00436
Cited By
SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning
1 August 2023
Ning Miao
Yee Whye Teh
Tom Rainforth
ReLM
LRM
Papers citing
"SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning"
50 / 88 papers shown
Title
Retrospex: Language Agent Meets Offline Reinforcement Learning Critic
Yufei Xiang
Yiqun Shen
Yeqin Zhang
Cam-Tu Nguyen
OffRL
LLMAG
KELM
LRM
14
0
0
17 May 2025
Safer Prompts: Reducing IP Risk in Visual Generative AI
Lena Reissinger
Yuanyuan Li
Anna-Carolina Haensch
Neeraj Sarna
33
0
0
06 May 2025
LogiDebrief: A Signal-Temporal Logic based Automated Debriefing Approach with Large Language Models Integration
Zirong Chen
Ziyan An
Jennifer Reynolds
Kristin Mullen
Stephen Martini
Meiyi Ma
34
0
0
06 May 2025
Which Agent Causes Task Failures and When? On Automated Failure Attribution of LLM Multi-Agent Systems
Shaokun Zhang
Ming Yin
Jieyu Zhang
Jing Liu
Zhiguang Han
...
Beibin Li
Chi Wang
H. Wang
Yuhang Chen
Qingyun Wu
49
1
0
30 Apr 2025
Comparing Uncertainty Measurement and Mitigation Methods for Large Language Models: A Systematic Review
Toghrul Abbasli
Kentaroh Toyoda
Yuan Wang
Leon Witt
Muhammad Asif Ali
Yukai Miao
Dan Li
Qingsong Wei
UQCV
92
0
0
25 Apr 2025
Perception in Reflection
Yana Wei
Liang Zhao
Kangheng Lin
En Yu
Yuang Peng
...
Jianjian Sun
Haoran Wei
Zheng Ge
Xiangyu Zhang
Vishal M. Patel
31
0
0
09 Apr 2025
KSHSeek: Data-Driven Approaches to Mitigating and Detecting Knowledge-Shortcut Hallucinations in Generative Models
Z. Wang
Zhongxin Liu
Ying Li
Hongyu Sun
Meng Xu
Yuqing Zhang
HILM
53
0
0
25 Mar 2025
J&H: Evaluating the Robustness of Large Language Models Under Knowledge-Injection Attacks in Legal Domain
Yiran Hu
Huanghai Liu
Qingjing Chen
Ning Zheng
C. Wang
Yun Liu
Charles L.A. Clarke
Weixing Shen
AAML
AILaw
ELM
72
0
0
24 Mar 2025
A Survey on Mathematical Reasoning and Optimization with Large Language Models
Ali Forootani
OffRL
LRM
AI4CE
45
0
0
22 Mar 2025
Temporal Consistency for LLM Reasoning Process Error Identification
Jiacheng Guo
Yue Wu
Jiahao Qiu
Kaixuan Huang
Xinzhe Juan
L. Yang
Mengdi Wang
LRM
63
1
0
18 Mar 2025
CoT-UQ: Improving Response-wise Uncertainty Quantification in LLMs with Chain-of-Thought
Boxuan Zhang
Ruqi Zhang
LRM
32
1
0
24 Feb 2025
A Survey on Feedback-based Multi-step Reasoning for Large Language Models on Mathematics
Ting-Ruen Wei
Haowei Liu
Xuyang Wu
Yi Fang
LRM
AI4CE
ReLM
KELM
202
1
0
21 Feb 2025
Reasoning-as-Logic-Units: Scaling Test-Time Reasoning in Large Language Models Through Logic Unit Alignment
Cheryl Li
Tianyuan Xu
Yiwen Guo
LRM
170
2
0
05 Feb 2025
Mathematical Language Models: A Survey
Wei Liu
Hanglei Hu
Jie Zhou
Yuyang Ding
Junsong Li
...
Mengliang He
Qin Chen
Bo Jiang
Aimin Zhou
Liang He
LRM
79
12
0
03 Jan 2025
Formal Mathematical Reasoning: A New Frontier in AI
Kaiyu Yang
Gabriel Poesia
Jingxuan He
Wenda Li
Kristin Lauter
Swarat Chaudhuri
Dawn Song
LRM
AI4CE
82
21
0
20 Dec 2024
EscapeBench: Pushing Language Models to Think Outside the Box
Cheng Qian
Peixuan Han
Qinyu Luo
Bingxiang He
Xiusi Chen
...
Jiarui Yao
Xiaocheng Yang
Denghui Zhang
Yunzhu Li
Heng Ji
LLMAG
LRM
90
3
0
18 Dec 2024
Improving Physics Reasoning in Large Language Models Using Mixture of Refinement Agents
Raj Jaiswal
Dhruv Jain
Harsh Parimal Popat
Avinash Anand
Abhishek Dharmadhikari
Atharva Marathe
R. Shah
LRM
AI4CE
90
3
0
01 Dec 2024
Teaching Models to Improve on Tape
L. Bezalel
Eyal Orgad
Amir Globerson
37
0
0
03 Nov 2024
Plan-on-Graph: Self-Correcting Adaptive Planning of Large Language Model on Knowledge Graphs
L. Chen
Panrong Tong
Zhongming Jin
Ying Sun
Jieping Ye
Hui Xiong
KELM
RALM
LRM
45
7
0
31 Oct 2024
Coarse-to-Fine Highlighting: Reducing Knowledge Hallucination in Large Language Models
Qitan Lv
Jie Wang
Hanzhu Chen
Bin Li
Yongdong Zhang
Feng Wu
HILM
28
3
0
19 Oct 2024
Nova: An Iterative Planning and Search Approach to Enhance Novelty and Diversity of LLM Generated Ideas
Xiang Hu
Hongyu Fu
Jinge Wang
Yifeng Wang
Zhikun Li
Renjun Xu
Yu Lu
Yaochu Jin
Lili Pan
Zhenzhong Lan
LRM
27
13
0
18 Oct 2024
Better to Ask in English: Evaluation of Large Language Models on English, Low-resource and Cross-Lingual Settings
Krishno Dey
Prerona Tarannum
Md. Arid Hasan
Imran Razzak
Usman Naseem
35
3
0
17 Oct 2024
Enhancing Mathematical Reasoning in LLMs by Stepwise Correction
Zhenyu Wu
Qingkai Zeng
Zhiwei Zhang
Zhaoxuan Tan
Chao Shen
Meng Jiang
KELM
LRM
41
4
0
16 Oct 2024
Varying Shades of Wrong: Aligning LLMs with Wrong Answers Only
Jihan Yao
Wenxuan Ding
Shangbin Feng
Lucy Lu Wang
Yulia Tsvetkov
32
0
0
14 Oct 2024
SuperCorrect: Advancing Small LLM Reasoning with Thought Template Distillation and Self-Correction
L. Yang
Zhaochen Yu
Tao Zhang
Minkai Xu
Joseph E. Gonzalez
Tengjiao Wang
Shuicheng Yan
ELM
ReLM
LRM
51
0
0
11 Oct 2024
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization
Guanlin Liu
Kaixuan Ji
Ning Dai
Zheng Wu
Chen Dun
Quanquan Gu
Lin Yan
OffRL
LRM
48
9
0
11 Oct 2024
AutoFeedback: An LLM-based Framework for Efficient and Accurate API Request Generation
Huanxi Liu
Jiaqi Liao
Dawei Feng
Kele Xu
Huaimin Wang
118
0
0
09 Oct 2024
O1 Replication Journey: A Strategic Progress Report -- Part 1
Yiwei Qin
Xuefeng Li
Haoyang Zou
Yixiu Liu
Shijie Xia
...
Yixin Ye
Weizhe Yuan
Hector Liu
Yuan Li
Pengfei Liu
VLM
48
68
0
08 Oct 2024
Rationale-Aware Answer Verification by Pairwise Self-Evaluation
Akira Kawabata
Saku Sugawara
LRM
36
3
0
07 Oct 2024
Mirror-Consistency: Harnessing Inconsistency in Majority Voting
Siyuan Huang
Zhiyuan Ma
Jintao Du
Changhua Meng
Weiqiang Wang
Zhouhan Lin
LRM
34
4
0
07 Oct 2024
Ordinal Preference Optimization: Aligning Human Preferences via NDCG
Yang Zhao
Yixin Wang
Mingzhang Yin
28
2
0
06 Oct 2024
DANA: Domain-Aware Neurosymbolic Agents for Consistency and Accuracy
Vinh Luong
Sang Dinh
Shruti Raghavan
William Nguyen
Zooey Nguyen
...
Kentaro Maegaito
Loc Nguyen
Thao Nguyen
Anh Hai Ha
Christopher Nguyen
28
0
0
27 Sep 2024
Attention Prompting on Image for Large Vision-Language Models
Runpeng Yu
Weihao Yu
Xinchao Wang
VLM
40
6
0
25 Sep 2024
A Multiple-Fill-in-the-Blank Exam Approach for Enhancing Zero-Resource Hallucination Detection in Large Language Models
Satoshi Munakata
Taku Fukui
Takao Mohri
21
0
0
20 Sep 2024
Do Large Language Models Speak All Languages Equally? A Comparative Study in Low-Resource Settings
Md. Arid Hasan
Prerona Tarannum
Krishno Dey
Imran Razzak
Usman Naseem
31
4
0
05 Aug 2024
LLM-Empowered State Representation for Reinforcement Learning
Boyuan Wang
Yun Qu
Yuhang Jiang
Jianzhun Shao
Chang-rui Liu
Wenming Yang
Xiangyang Ji
40
7
0
18 Jul 2024
Towards Collaborative Intelligence: Propagating Intentions and Reasoning for Multi-Agent Coordination with Large Language Models
Xihe Qiu
Haoyu Wang
Xiaoyu Tan
Chao Qu
Yujie Xiong
Yuan Cheng
Yinghui Xu
Wei Chu
Yuan Qi
LLMAG
LM&Ro
43
0
0
17 Jul 2024
Speech-Copilot: Leveraging Large Language Models for Speech Processing via Task Decomposition, Modularization, and Program Generation
Chun-Yi Kuan
Chih-Kai Yang
Wei-Ping Huang
Ke-Han Lu
Hung-yi Lee
46
5
0
13 Jul 2024
Large Language Model as an Assignment Evaluator: Insights, Feedback, and Challenges in a 1000+ Student Course
Cheng-Han Chiang
Wei-Chih Chen
Chun-Yi Kuan
Chienchou Yang
Hung-yi Lee
ELM
AI4Ed
43
5
0
07 Jul 2024
DynaThink: Fast or Slow? A Dynamic Decision-Making Framework for Large Language Models
Jiabao Pan
Yan Zhang
Chen Zhang
Zuozhu Liu
Hongwei Wang
Haizhou Li
LRM
37
3
0
01 Jul 2024
Word Matters: What Influences Domain Adaptation in Summarization?
Yinghao Li
Siyu Miao
Heyan Huang
Yang Gao
46
3
0
21 Jun 2024
Do LLMs Exhibit Human-Like Reasoning? Evaluating Theory of Mind in LLMs for Open-Ended Responses
Maryam Amirizaniani
Elias Martin
Maryna Sivachenko
A. Mashhadi
Chirag Shah
LRM
42
12
0
09 Jun 2024
Generalization-Enhanced Code Vulnerability Detection via Multi-Task Instruction Fine-Tuning
Xiaohu Du
Ming Wen
Jiahao Zhu
Zifan Xie
Bin Ji
Huijun Liu
Xuanhua Shi
Hai Jin
37
14
0
06 Jun 2024
When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs
Ryo Kamoi
Yusen Zhang
Nan Zhang
Jiawei Han
Rui Zhang
LRM
50
57
0
03 Jun 2024
A Survey of Useful LLM Evaluation
Ji-Lun Peng
Sijia Cheng
Egil Diau
Yung-Yu Shih
Po-Heng Chen
Yen-Ting Lin
Yun-Nung Chen
LLMAG
ELM
34
12
0
03 Jun 2024
Preemptive Answer "Attacks" on Chain-of-Thought Reasoning
Rongwu Xu
Zehan Qi
Wei Xu
LRM
SILM
64
6
0
31 May 2024
Jailbreaking Large Language Models Against Moderation Guardrails via Cipher Characters
Haibo Jin
Andy Zhou
Joe D. Menke
Haohan Wang
38
11
0
30 May 2024
Large Language Models Can Self-Correct with Minimal Effort
Zhenyu Wu
Qingkai Zeng
Zhihan Zhang
Zhaoxuan Tan
Chao Shen
Meng Jiang
KELM
LRM
ReLM
46
3
0
23 May 2024
LLMs can Find Mathematical Reasoning Mistakes by Pedagogical Chain-of-Thought
Zhuoxuan Jiang
Haoyuan Peng
Shanshan Feng
Fan Li
Dongsheng Li
LRM
KELM
46
12
0
09 May 2024
Self-Reflection in LLM Agents: Effects on Problem-Solving Performance
Matthew Renze
Erhan Guven
LRM
LLMAG
44
37
0
05 May 2024