2304.05128
Cited By
v1
v2 (latest)
Teaching Large Language Models to Self-Debug
11 April 2023
Xinyun Chen
Maxwell Lin
Nathanael Schärli
Denny Zhou
LRM
Papers citing
"Teaching Large Language Models to Self-Debug"
50 / 145 papers shown
Title
Patterns Over Principles: The Fragility of Inductive Reasoning in LLMs under Noisy Observations
Chunyang Li
Weiqi Wang
Tianshi Zheng
Yangqiu Song
LRM
134
6
0
22 Feb 2025
Evolving Symbolic 3D Visual Grounder with Weakly Supervised Reflection
Boyu Mi
Hanqing Wang
Tai Wang
Yilun Chen
Jiangmiao Pang
135
0
0
21 Feb 2025
ARS: Automatic Routing Solver with Large Language Models
Kai Li
Fei Liu
Zhenkun Wang
Xialiang Tong
Xiongwei Han
Mingxuan Yuan
Qingfu Zhang
116
0
0
21 Feb 2025
Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o Under Data Scarcity
Dylan Zhang
Justin Wang
Tianran Sun
122
1
0
17 Feb 2025
SQL-o1: A Self-Reward Heuristic Dynamic Search Method for Text-to-SQL
Shuai Lyu
Haoran Luo
Zhonghong Ou
Jiangfeng Sun
Yang Qin
Xiaoran Shang
Meina Song
Yifan Zhu
AI4TS
LRM
154
5
0
17 Feb 2025
LeDex: Training LLMs to Better Self-Debug and Explain Code
Nan Jiang
Xiaopeng Li
Shiqi Wang
Qiang Zhou
Soneya Binta Hossain
Baishakhi Ray
Varun Kumar
Xiaofei Ma
Anoop Deoras
LRM
168
19
0
17 Feb 2025
Revisiting the Test-Time Scaling of o1-like Models: Do they Truly Possess Test-Time Scaling Capabilities?
Zhiyuan Zeng
Qinyuan Cheng
Zhangyue Yin
Yunhua Zhou
Xipeng Qiu
LRM
178
20
0
17 Feb 2025
Flaming-hot Initiation with Regular Execution Sampling for Large Language Models
Weizhe Chen
Zhicheng Zhang
Guanlin Liu
Renjie Zheng
Wenlei Shi
Chen Dun
Zheng Wu
Xing Jin
Lin Yan
ALM
LRM
181
3
0
17 Feb 2025
The Validation Gap: A Mechanistic Analysis of How Language Models Compute Arithmetic but Fail to Validate It
Leonardo Bertolazzi
Philipp Mondorf
Yun Xue
Raffaella Bernardi
AIFin
LRM
119
0
0
17 Feb 2025
RePrompt: Planning by Automatic Prompt Engineering for Large Language Models Agents
Weizhe Chen
Sven Koenig
B. Dilkina
LLMAG
214
12
0
17 Feb 2025
RefineCoder: Iterative Improving of Large Language Models via Adaptive Critique Refinement for Code Generation
C. Zhou
Xinyu Zhang
Dandan Song
Xiancai Chen
Wanli Gu
Huipeng Ma
Yuhang Tian
Hao Fei
Linmei Hu
98
2
0
13 Feb 2025
CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging
Md. Ashraful Islam
Mohammed Eunus Ali
Md. Rizwan Parvez
LLMAG
159
4
0
08 Feb 2025
Iterative Deepening Sampling as Efficient Test-Time Scaling
Weizhe Chen
Sven Koenig
B. Dilkina
LRM
ReLM
154
1
0
08 Feb 2025
Disproving Program Equivalence with LLMs
Miltiadis Allamanis
Pengcheng Yin
170
0
0
05 Feb 2025
Reasoning-as-Logic-Units: Scaling Test-Time Reasoning in Large Language Models Through Logic Unit Alignment
Cheryl Li
Tianyuan Xu
Yiwen Guo
LRM
473
3
0
05 Feb 2025
Self-Improving Transformers Overcome Easy-to-Hard and Length Generalization Challenges
Nayoung Lee
Ziyang Cai
Avi Schwarzschild
Kangwook Lee
Dimitris Papailiopoulos
ReLM
VLM
LRM
AI4CE
166
7
0
03 Feb 2025
Learning to Generate Unit Tests for Automated Debugging
Archiki Prasad
Elias Stengel-Eskin
Justin Chih-Yao Chen
Zaid Khan
Joey Tianyi Zhou
ELM
172
4
0
03 Feb 2025
Towards Advancing Code Generation with Large Language Models: A Research Roadmap
Haolin Jin
Huaming Chen
Qinghua Lu
Liming Zhu
LLMAG
130
2
0
20 Jan 2025
Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models
Junyu Chen
Han Cai
Junsong Chen
Enze Xie
Shang Yang
Haotian Tang
Zhekai Zhang
Yaojie Lu
Song Han
DiffM
174
7
0
20 Jan 2025
QualityFlow: An Agentic Workflow for Program Synthesis Controlled by LLM Quality Checks
Yaojie Hu
Qiang Zhou
Qihong Chen
Xiaopeng Li
Linbo Liu
Dejiao Zhang
Amit Kachroo
Talha Oz
Omer Tripp
171
7
0
20 Jan 2025
Understanding Before Reasoning: Enhancing Chain-of-Thought with Iterative Summarization Pre-Prompting
Dong-Hai Zhu
Yu-Jie Xiong
Jia-Chen Zhang
Xi-Jiong Xie
Chun-Ming Xia
ReLM
LRM
63
0
0
08 Jan 2025
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
Tianyu Zheng
Ge Zhang
Tianhao Shen
Xueling Liu
Bill Yuchen Lin
Jie Fu
Wenhu Chen
Xiang Yue
SyDa
201
131
0
08 Jan 2025
Recursive Decomposition of Logical Thoughts: Framework for Superior Reasoning and Knowledge Propagation in Large Language Models
Kaleem Ullah Qasim
Jiashu Zhang
Tariq Alsahfi
Ateeq Ur Rehman Butt
LRM
ReLM
134
1
0
03 Jan 2025
Reasoning Through Execution: Unifying Process and Outcome Rewards for Code Generation
Zhuohao Yu
Weizheng Gu
Yidong Wang
Xingru Jiang
Zhengran Zeng
Jindong Wang
Wei Ye
Shikun Zhang
LRM
187
5
0
19 Dec 2024
Time-Reversal Provides Unsupervised Feedback to LLMs
Yerram Varun
Rahul Madhavan
Sravanti Addepalli
A. Suggala
Karthikeyan Shanmugam
Prateek Jain
LRM
SyDa
97
0
0
03 Dec 2024
VISCO: Benchmarking Fine-Grained Critique and Correction Towards Self-Improvement in Visual Reasoning
Xueqing Wu
Yuheng Ding
Bingxuan Li
Pan Lu
Da Yin
Kai-Wei Chang
Nanyun Peng
LRM
152
4
0
03 Dec 2024
Planning-Driven Programming: A Large Language Model Programming Workflow
Chao Lei
Yanchuan Chang
Nir Lipovetzky
Krista A. Ehinger
206
6
0
21 Nov 2024
Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows
Fangyu Lei
Jixuan Chen
Yuxiao Ye
Ruisheng Cao
Dongchan Shin
...
Caiming Xiong
Ruoxi Sun
Qian Liu
Sida I. Wang
Tao Yu
LMTD
128
37
0
12 Nov 2024
Grounding Natural Language to SQL Translation with Data-Based Self-Explanations
Yuankai Fan
Tonghui Ren
Can Huang
Zhenying He
Xinyu Wang
LRM
130
2
0
05 Nov 2024
Smaller Large Language Models Can Do Moral Self-Correction
Guangliang Liu
Zhiyu Xue
Rongrong Wang
Kristen Marie Johnson
LRM
98
0
0
30 Oct 2024
Automated Proof Generation for Rust Code via Self-Evolution
Tianyu Chen
Shuai Lu
Shan Lu
Yeyun Gong
Chenyuan Yang
...
Peng Cheng
Fan Yang
Shuvendu Lahiri
Tao Xie
Lidong Zhou
146
10
0
21 Oct 2024
MCCoder: Streamlining Motion Control with LLM-Assisted Code Generation and Rigorous Verification
Yin Li
Liangwei Wang
Shiyuan Piao
Boo-Ho Yang
Ziyue Li
Wei Zeng
Fugee Tsung
83
0
0
19 Oct 2024
MCQG-SRefine: Multiple Choice Question Generation and Evaluation with Iterative Self-Critique, Correction, and Comparison Feedback
Zonghai Yao
Aditya Parashar
Huixue Zhou
Won Seok Jang
Feiyun Ouyang
Zhichao Yang
Hong-ye Yu
ELM
137
2
0
17 Oct 2024
LLMOPT: Learning to Define and Solve General Optimization Problems from Scratch
Caigao Jiang
Xiang Shu
Hong Qian
Xingyu Lu
Jun-ping Zhou
Aimin Zhou
Yang Yu
129
8
0
17 Oct 2024
MSc-SQL: Multi-Sample Critiquing Small Language Models For Text-To-SQL Translation
S. Gorti
Ilan Gofman
Zhaoyan Liu
Jiapeng Wu
Noël Vouitsis
Guangwei Yu
Jesse C. Cresswell
Rasa Hosseinzadeh
SyDa
140
12
0
16 Oct 2024
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization
Guanlin Liu
Kaixuan Ji
Ning Dai
Zheng Wu
Chen Dun
Quanquan Gu
Lin Yan
OffRL
LRM
143
13
0
11 Oct 2024
Divide and Translate: Compositional First-Order Logic Translation and Verification for Complex Logical Reasoning
Hyun Ryu
Gyeongman Kim
Hyemin S. Lee
Eunho Yang
LRM
148
10
0
10 Oct 2024
Generating CAD Code with Vision-Language Models for 3D Designs
Kamel Alrashedy
Pradyumna Tambwekar
Z. Zaidi
Megan Langwasser
Wei Xu
Matthew Gombolay
101
13
0
07 Oct 2024
RGD: Multi-LLM Based Agent Debugger via Refinement and Generation Guidance
Haolin Jin
Zechao Sun
Huaming Chen
LLMAG
79
2
0
02 Oct 2024
Aligning Language Models Using Follow-up Likelihood as Reward Signal
Chen Zhang
Dading Chong
Feng Jiang
Chengguang Tang
Anningzhe Gao
Guohua Tang
Haizhou Li
ALM
105
2
0
20 Sep 2024
CRUXEval-X: A Benchmark for Multilingual Code Reasoning, Understanding and Execution
Ruiyang Xu
Jialun Cao
Yaojie Lu
Ming Wen
Hongyu Lin
Xianpei Han
Ben He
Shing-Chi Cheung
Le Sun
LRM
ELM
105
6
0
23 Aug 2024
Selective Prompt Anchoring for Code Generation
Yuan Tian
Tianyi Zhang
263
3
0
17 Aug 2024
From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future
Haolin Jin
Linghan Huang
Haipeng Cai
Jun Yan
Bo Li
Huaming Chen
158
37
0
05 Aug 2024
A Survey on Employing Large Language Models for Text-to-SQL Tasks
Liang Shi
Zhengju Tang
Nan Zhang
Xiaotong Zhang
Zhi Yang
204
31
0
21 Jul 2024
Learning to Refine with Fine-Grained Natural Language Feedback
Manya Wadhwa
Xinyu Zhao
Junyi Jessy Li
Greg Durrett
100
16
0
02 Jul 2024
Automatically Generating UI Code from Screenshot: A Divide-and-Conquer-Based Approach
Yuxuan Wan
Chaozheng Wang
Yi Dong
Wenxuan Wang
Shuqing Li
Yintong Huo
Michael R. Lyu
3DV
172
14
0
24 Jun 2024
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
Terry Yue Zhuo
Minh Chien Vu
Jenny Chim
Han Hu
Wenhao Yu
...
David Lo
Daniel Fried
Xiaoning Du
H. D. Vries
Leandro von Werra
235
193
0
22 Jun 2024
Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL
Zijin Hong
Zheng Yuan
Qinggang Zhang
Hao Chen
Junnan Dong
Feiran Huang
Xiao Huang
191
74
0
12 Jun 2024
Teaching Language Models to Self-Improve by Learning from Language Feedback
Chi Hu
Yimin Hu
Hang Cao
Tong Xiao
Jingbo Zhu
LRM
VLM
79
5
0
11 Jun 2024
Learning Task Decomposition to Assist Humans in Competitive Programming
Jiaxin Wen
Ruiqi Zhong
Pei Ke
Zhihong Shao
Hongning Wang
Minlie Huang
ReLM
126
9
0
07 Jun 2024