2304.05128
Cited By
v1
v2 (latest)
Teaching Large Language Models to Self-Debug
11 April 2023
Xinyun Chen
Maxwell Lin
Nathanael Schärli
Denny Zhou
LRM
Papers citing
"Teaching Large Language Models to Self-Debug"
50 / 145 papers shown
Title
Feedback Friction: LLMs Struggle to Fully Incorporate External Feedback
Dongwei Jiang
Alvin Zhang
Andrew Wang
Nicholas Andrews
Daniel Khashabi
LRM
21
0
0
13 Jun 2025
Harmonizing Geometry and Uncertainty: Diffusion with Hyperspheres
Muskan Dosi
Chiranjeev Chiranjeev
K. Thakral
Mayank Vatsa
Richa Singh
106
0
0
12 Jun 2025
Provably Learning from Language Feedback
Wanqiao Xu
Allen Nie
Ruijie Zheng
Aditya Modi
Adith Swaminathan
Ching-An Cheng
139
0
0
12 Jun 2025
PAG: Multi-Turn Reinforced LLM Self-Correction with Policy as Generative Verifier
Y. Jiang
Yuwen Xiong
Yufeng Yuan
Chao Xin
Wenyuan Xu
Yu Yue
Qianchuan Zhao
Lin Yan
LRM
112
0
0
12 Jun 2025
Learning to Reason Across Parallel Samples for LLM Reasoning
Jianing Qi
Xi Ye
Hao Tang
Zhigang Zhu
Eunsol Choi
ReLM
LRM
17
0
0
10 Jun 2025
CP-Bench: Evaluating Large Language Models for Constraint Modelling
Kostis Michailidis
Dimos Tsouros
Tias Guns
65
0
0
06 Jun 2025
Sample Complexity and Representation Ability of Test-time Scaling Paradigms
Baihe Huang
Shanda Li
Tianhao Wu
Yiming Yang
Ameet Talwalkar
Kannan Ramchandran
Michael I. Jordan
Jiantao Jiao
LRM
111
0
0
05 Jun 2025
A Reasoning-Based Approach to Cryptic Crossword Clue Solving
Martin Andrews
Sam Witteveen
ReLM
ELM
LRM
104
0
0
05 Jun 2025
Please Translate Again: Two Simple Experiments on Whether Human-Like Reasoning Helps Translation
Di Wu
Seth Aycock
Christof Monz
ReLM
LRM
108
0
0
05 Jun 2025
SQLens: An End-to-End Framework for Error Detection and Correction in Text-to-SQL
Yue Gong
Chuan Lei
X. Qin
Kapil Vaidya
Balakrishnan Narayanaswamy
Tim Kraska
31
0
0
04 Jun 2025
EvaLearn: Quantifying the Learning Capability and Efficiency of LLMs via Sequential Problem Solving
Shihan Dou
Ming Zhang
Chenhao Huang
Jiayi Chen
F. Chen
...
Wei Chengzhi
Lin Yan
Qi Zhang
Xuanjing Huang
ELM
77
0
0
03 Jun 2025
Pi-SQL: Enhancing Text-to-SQL with Fine-Grained Guidance from Pivot Programming Languages
Yongdong Chi
Hanqing Wang
Zonghan Yang
Jian Yang
Xiao Yan
Yun-Nung Chen
Guanhua Chen
37
0
0
01 Jun 2025
SHARE: An SLM-based Hierarchical Action CorREction Assistant for Text-to-SQL
Ge Qu
Jinyang Li
Bowen Qin
Xiaolong Li
Nan Huo
Chenhao Ma
Reynold Cheng
20
0
0
31 May 2025
CADReview: Automatically Reviewing CAD Programs with Error Detection and Correction
Jiali Chen
Xusen Hei
HongFei Liu
Yuancheng Wei
Zikun Deng
Jiayuan Xie
Yi Cai
Li Qing
55
0
0
28 May 2025
BacktrackAgent: Enhancing GUI Agent with Error Detection and Backtracking Mechanism
Qinzhuo Wu
Pengzhi Gao
Wei Liu
Jian Luan
LLMAG
57
0
0
27 May 2025
HyGenar: An LLM-Driven Hybrid Genetic Algorithm for Few-Shot Grammar Generation
Weizhi Tang
Yixuan Li
Chris Sypherd
Elizabeth Polgreen
Vaishak Belle
45
0
0
22 May 2025
The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning
Shivam Agarwal
Zimin Zhang
Lifan Yuan
Jiawei Han
Hao Peng
162
8
0
21 May 2025
R&D-Agent-Quant: A Multi-Agent Framework for Data-Centric Factors and Model Joint Optimization
Yuante Li
Xu Yang
Xiao Yang
Minrui Xu
Xisen Wang
Weiqing Liu
Jiang Bian
AIFin
261
0
0
21 May 2025
Sense and Sensitivity: Examining the Influence of Semantic Recall on Long Context Code Reasoning
Adam Štorek
Mukur Gupta
Samira Hajizadeh
Prashast Srivastava
Suman Jana
LRM
69
0
0
19 May 2025
Critique-Guided Distillation: Improving Supervised Fine-tuning via Better Distillation
Berkcan Kapusuzoglu
Supriyo Chakraborty
Chia-Hsuan Lee
Sambit Sahu
123
0
0
16 May 2025
SoftCoT++: Test-Time Scaling with Soft Chain-of-Thought Reasoning
Yige Xu
Xu Guo
Zhiwei Zeng
Chunyan Miao
BDL
LRM
146
1
0
16 May 2025
Mining Hidden Thoughts from Texts: Evaluating Continual Pretraining with Synthetic Data for LLM Reasoning
Yoichi Ishibashi
Taro Yano
Masafumi Oyamada
SyDa
LRM
106
2
0
15 May 2025
Towards Effectively Leveraging Execution Traces for Program Repair with Code LLMs
Mirazul Haque
Petr Babkin
Farima Farmahinifarahani
Manuela Veloso
58
0
0
07 May 2025
QiMeng-Xpiler: Transcompiling Tensor Programs for Deep Learning Systems with a Neural-Symbolic Approach
Shouyang Dong
Yuanbo Wen
Jun Bi
Di Huang
Jiaming Guo
...
Yifan Hao
Xuehai Zhou
Tianshi Chen
Qi Guo
Yunji Chen
43
1
0
04 May 2025
CrashFixer: A crash resolution agent for the Linux kernel
Alex Mathai
Chenxi Huang
Suwei Ma
Jihwan Kim
Hailie Mitchell
Aleksandr Nogikh
Petros Maniatis
Franjo Ivančić
Junfeng Yang
Baishakhi Ray
131
1
0
29 Apr 2025
Think2SQL: Reinforce LLM Reasoning Capabilities for Text2SQL
Simone Papicchio
Simone Rossi
Luca Cagliero
Paolo Papotti
ReLM
LMTD
AI4TS
LRM
128
1
0
21 Apr 2025
DocAgent: A Multi-Agent System for Automated Code Documentation Generation
Dayu Yang
Antoine Simoulin
Xin Qian
Xiaoyi Liu
Yuwei Cao
Zhaopu Teng
Grey Yang
LLMAG
147
0
0
11 Apr 2025
Synthesizing High-Quality Programming Tasks with LLM-based Expert and Student Agents
Manh Hung Nguyen
Victor-Alexandru Pădurean
Alkis Gotovos
Sebastian Tschiatschek
Adish Singla
69
0
0
10 Apr 2025
Cognitive Debiasing Large Language Models for Decision-Making
Yougang Lyu
Shijie Ren
Yue Feng
Zihan Wang
Zhongfu Chen
Zhaochun Ren
Maarten de Rijke
239
0
0
05 Apr 2025
On the Role of Feedback in Test-Time Scaling of Agentic AI Workflows
Souradip Chakraborty
Mohammadreza Pourreza
Ruoxi Sun
Yiwen Song
Nino Scherrer
...
Furong Huang
Amrit Singh Bedi
Ahmad Beirami
Hamid Palangi
Tomas Pfister
129
2
0
02 Apr 2025
On Benchmarking Code LLMs for Android Malware Analysis
Yiling He
Hongyu She
Xingzhi Qian
Xinran Zheng
Zhuo Chen
Zhan Qin
Lorenzo Cavallaro
ELM
120
1
0
01 Apr 2025
LEMMA: Learning from Errors for MatheMatical Advancement in LLMs
Zhuoshi Pan
Yu Li
Honglin Lin
Qizhi Pei
Zinan Tang
Wei Wu
Chenlin Ming
H. Vicky Zhao
Zeang Sheng
Lijun Wu
LRM
152
6
0
21 Mar 2025
FutureGen: LLM-RAG Approach to Generate the Future Work of Scientific Article
Ibrahim Al Azher
Miftahul Jannat Mokarrama
Zhishuai Guo
Sagnik Ray Choudhury
Hamed Alhoori
LLMAG
106
2
0
20 Mar 2025
DAPO: An Open-Source LLM Reinforcement Learning System at Scale
Qiying Yu
Zheng Zhang
Ruofei Zhu
Yufeng Yuan
Xiaochen Zuo
...
Ya Zhang
Lin Yan
Mu Qiao
Yonghui Wu
Mingxuan Wang
OffRL
LRM
243
217
0
18 Mar 2025
LLMPerf: GPU Performance Modeling meets Large Language Models
Khoi N.M. Nguyen
Hoang Duy Nguyen Do
Huyen Thao Le
T. Dao
82
0
0
14 Mar 2025
Fine-Tuning Diffusion Generative Models via Rich Preference Optimization
Hanyang Zhao
Haoxian Chen
Yucheng Guo
Genta Indra Winata
Tingting Ou
Ziyu Huang
D. Yao
Wenpin Tang
130
0
0
13 Mar 2025
EditLord: Learning Code Transformation Rules for Code Editing
Weichen Li
Albert Jan
Baishakhi Ray
Junfeng Yang
Chengzhi Mao
Kexin Pei
KELM
62
2
0
10 Mar 2025
Fully Autonomous Programming using Iterative Multi-Agent Debugging with Large Language Models
Anastasiia Grishina
Vadim Liventsev
Aki Härmä
Leon Moonen
ELM
155
0
0
10 Mar 2025
Exploiting Edited Large Language Models as General Scientific Optimizers
Qitan Lv
T. Liu
Haoyu Wang
174
1
0
08 Mar 2025
AutoIOT: LLM-Driven Automated Natural Language Programming for AIoT Applications
Leming Shen
Qiang Yang
Yuanqing Zheng
Mo Li
102
3
0
07 Mar 2025
From Voice to Safety: Language AI Powered Pilot-ATC Communication Understanding for Airport Surface Movement Collision Risk Assessment
Yutian Pang
Andrew Paul Kendall
Alex Porcayo
Mariah Barsotti
Anahita Jain
John-Paul Clarke
111
0
0
06 Mar 2025
LLMs Can Generate a Better Answer by Aggregating Their Own Responses
Zichong Li
Xinyu Feng
Yuheng Cai
Zixuan Zhang
Tianyi Liu
Chen Liang
Weizhu Chen
Haoyu Wang
Tiejun Zhao
LRM
115
2
0
06 Mar 2025
Unified Mind Model: Reimagining Autonomous Agents in the LLM Era
Pengbo Hu
Xiang Ying
LLMAG
LM&Ro
AI4CE
139
1
0
05 Mar 2025
Learning from Failures in Multi-Attempt Reinforcement Learning
Stephen Chung
Wenyu Du
Jie Fu
LRM
90
3
0
04 Mar 2025
Generator-Assistant Stepwise Rollback Framework for Large Language Model Agent
Xingzuo Li
Kehai Chen
Yunfei Long
X. Bai
Yong-mei Xu
Min Zhang
LLMAG
LRM
125
1
0
04 Mar 2025
The Power of Personality: A Human Simulation Perspective to Investigate Large Language Model Agents
Yifan Duan
Yihong Tang
Xuefeng Bai
Kehai Chen
Junlin Li
Min Zhang
LLMAG
532
2
0
28 Feb 2025
ConvCodeWorld: Benchmarking Conversational Code Generation in Reproducible Feedback Environments
Hojae Han
Seung-won Hwang
Rajhans Samdani
Yuxiong He
ALM
109
4
0
27 Feb 2025
Multi-Turn Code Generation Through Single-Step Rewards
A. Jain
Gonzalo Gonzalez-Pumariega
Wayne Chen
Alexander M. Rush
Wenting Zhao
Sanjiban Choudhury
LRM
83
3
0
27 Feb 2025
FACT-AUDIT: An Adaptive Multi-Agent Framework for Dynamic Fact-Checking Evaluation of Large Language Models
Hongzhan Lin
Yang Deng
Yuxuan Gu
Wenxuan Zhang
Jing Ma
See-Kiong Ng
Tat-Seng Chua
LLMAG
KELM
HILM
144
1
0
25 Feb 2025
DISC: Dynamic Decomposition Improves LLM Inference Scaling
Jonathan Light
Wei Cheng
Benjamin Rivière
Wu Yue
Masafumi Oyamada
Mengdi Wang
Yisong Yue
Santiago Paternain
Haifeng Chen
ReLM
LRM
127
2
0
23 Feb 2025