ResearchTrend.AI
Self-Refine: Iterative Refinement with Self-Feedback

30 March 2023
Aman Madaan
Niket Tandon
Prakhar Gupta
Skyler Hallinan
Luyu Gao
Sarah Wiegreffe
Uri Alon
Nouha Dziri
Shrimai Prabhumoye
Yiming Yang
Shashank Gupta
Bodhisattwa Prasad Majumder
Katherine Hermann
Sean Welleck
Amir Yazdanbakhsh
Peter Clark
ReLM, LRM, DiffM

Papers citing "Self-Refine: Iterative Refinement with Self-Feedback"

50 / 431 papers shown
Revisiting Multi-Agent Debate as Test-Time Scaling: A Systematic Study of Conditional Effectiveness
Yongjin Yang
Euiin Yi
Jongwoo Ko
Kimin Lee
Zhijing Jin
Se-Young Yun
LLMAG
64
0
0
29 May 2025
Sherlock: Self-Correcting Reasoning in Vision-Language Models
Yi Ding
Ruqi Zhang
ReLM, LRM, VLM
110
0
0
28 May 2025
Text2Grad: Reinforcement Learning from Natural Language Feedback
Hanyang Wang
Lu Wang
Chaoyun Zhang
Tianjun Mao
Si Qin
Qingwei Lin
Saravan Rajmohan
Dongmei Zhang
88
0
0
28 May 2025
Advancing Expert Specialization for Better MoE
Hongcan Guo
Haolang Lu
Guoshun Nan
Bolun Chu
Jialin Zhuang
Yuan Yang
Wenhao Che
Sicong Leng
Qimei Cui
Xudong Jiang
MoE, MoMe
103
0
0
28 May 2025
Self-Critique and Refinement for Faithful Natural Language Explanations
Yingming Wang
Pepa Atanasova
LRM
139
0
0
28 May 2025
A Large Language Model-Enabled Control Architecture for Dynamic Resource Capability Exploration in Multi-Agent Manufacturing Systems
Jonghan Lim
Ilya Kovalenko
AI4CE
51
0
0
28 May 2025
Step-Wise Formal Verification for LLM-Based Mathematical Problem Solving
Kuo Zhou
Lu Zhang
LRM
90
0
0
27 May 2025
Pretraining Language Models to Ponder in Continuous Space
Boyi Zeng
Shixiang Song
Siyuan Huang
Yixuan Wang
He Li
Ziwei He
Xinbing Wang
Zhiyu Li
Zhouhan Lin
LRM
105
0
0
27 May 2025
BacktrackAgent: Enhancing GUI Agent with Error Detection and Backtracking Mechanism
Qinzhuo Wu
Pengzhi Gao
Wei Liu
Jian Luan
LLMAG
63
0
0
27 May 2025
R1-Code-Interpreter: Training LLMs to Reason with Code via Supervised and Reinforcement Learning
Yongchao Chen
Y. Liu
Junwei Zhou
Yilun Hao
Jingquan Wang
Yang Zhang
Chuchu Fan
OffRL, ReLM, AI4TS, SyDa, ALM, LRM
81
0
0
27 May 2025
Factual Self-Awareness in Language Models: Representation, Robustness, and Scaling
Hovhannes Tamoyan
Subhabrata Dutta
Iryna Gurevych
HILM, KELM
62
0
0
27 May 2025
Automatic Transmission for LLM Tiers: Optimizing Cost and Accuracy in Large Language Models
Injae Na
Keonwoong Noh
Woohwan Jung
76
0
0
27 May 2025
Can Large Reasoning Models Self-Train?
Sheikh Shafayat
Fahim Tajwar
Ruslan Salakhutdinov
J. Schneider
Andrea Zanette
ReLM, OffRL, LRM
87
2
0
27 May 2025
MIRROR: Multi-agent Intra- and Inter-Reflection for Optimized Reasoning in Tool Learning
Zikang Guo
Benfeng Xu
Xiaorui Wang
Zhendong Mao
98
0
0
27 May 2025
Can Compressed LLMs Truly Act? An Empirical Evaluation of Agentic Capabilities in LLM Compression
Peijie Dong
Zhenheng Tang
Xiang Liu
Lujun Li
Xiaowen Chu
Bo Li
115
0
0
26 May 2025
Self-Reflective Planning with Knowledge Graphs: Enhancing LLM Reasoning Reliability for Question Answering
J. Zhu
Ye Liu
Meikai Bao
Kai Zhang
Yanghai Zhang
Qi Liu
LRM
49
0
0
26 May 2025
Optimization-Inspired Few-Shot Adaptation for Large Language Models
Boyan Gao
Xin Wang
Yibo Yang
David A. Clifton
69
0
0
25 May 2025
Flex-Judge: Think Once, Judge Anywhere
Jongwoo Ko
S. Kim
Sungwoo Cho
Se-Young Yun
ELM, LRM
220
0
0
24 May 2025
Writing Like the Best: Exemplar-Based Expository Text Generation
Yuxiang Liu
Kevin Chen-Chuan Chang
46
0
0
24 May 2025
Think Before You Accept: Semantic Reflective Verification for Faster Speculative Decoding
Yixuan Wang
Yijun Liu
Shiyu Ji
Yuzhuang Xu
Yang Xu
Qingfu Zhu
Wanxiang Che
OffRL, LRM
61
0
0
24 May 2025
Unraveling Misinformation Propagation in LLM Reasoning
Yiyang Feng
Yichen Wang
Shaobo Cui
Boi Faltings
Mina Lee
Jiawei Zhou
LRM
92
0
0
24 May 2025
The Quest for Efficient Reasoning: A Data-Centric Benchmark to CoT Distillation
Ruichen Zhang
Rana Muhammad Shahroz Khan
Zhen Tan
Dawei Li
Song Wang
Tianlong Chen
LRM
68
0
0
24 May 2025
Controlled Agentic Planning & Reasoning for Mechanism Synthesis
Joao Pedro Gandarela
Thiago Rios
Stefan Menzel
André Freitas
LLMAG, LRM
56
0
0
23 May 2025
Dynamic Risk Assessments for Offensive Cybersecurity Agents
Boyi Wei
Benedikt Stroebl
Jiacen Xu
Joie Zhang
Zhou Li
Peter Henderson
90
0
0
23 May 2025
Navigate the Unknown: Enhancing LLM Reasoning with Intrinsic Motivation Guided Exploration
Jingtong Gao
Ling Pan
Yejing Wang
Rui Zhong
Chi Lu
Qingpeng Cai
Peng Jiang
Xiangyu Zhao
LRM
119
1
0
23 May 2025
SWE-Dev: Evaluating and Training Autonomous Feature-Driven Software Development
Yaxin Du
Yuzhu Cai
Yifan Zhou
Cheng-Yu Wang
Yu Qian
Xianghe Pang
Qian Liu
Yue Hu
Siheng Chen
70
0
0
22 May 2025
When Do LLMs Admit Their Mistakes? Understanding the Role of Model Belief in Retraction
Yuqing Yang
Robin Jia
KELM, LRM
130
1
0
22 May 2025
SMART: Self-Generating and Self-Validating Multi-Dimensional Assessment for LLMs' Mathematical Problem Solving
Yujie Hou
Ting Zhang
Mei Wang
Xuetao Ma
Hua Huang
LRM
220
0
0
22 May 2025
HyGenar: An LLM-Driven Hybrid Genetic Algorithm for Few-Shot Grammar Generation
Weizhi Tang
Yixuan Li
Chris Sypherd
Elizabeth Polgreen
Vaishak Belle
60
0
0
22 May 2025
The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning
Shivam Agarwal
Zimin Zhang
Lifan Yuan
Jiawei Han
Hao Peng
186
8
0
21 May 2025
Reward Is Enough: LLMs Are In-Context Reinforcement Learners
Kefan Song
Amir Moeini
Peng Wang
Lei Gong
Rohan Chandra
Yanjun Qi
Shangtong Zhang
ReLM, LRM
45
3
0
21 May 2025
Abacus: A Cost-Based Optimizer for Semantic Operator Systems
Matthew Russo
Sivaprasad Sudhir
Gerardo Vitagliano
Chunwei Liu
Tim Kraska
Samuel Madden
Michael Cafarella
153
0
0
20 May 2025
BAR: A Backward Reasoning based Agent for Complex Minecraft Tasks
Weihong Du
Wenrui Liao
Binyu Yan
Hongru Liang
Anthony G. Cohn
Wenqiang Lei
LLMAG, LM&Ro
148
0
0
20 May 2025
Introspective Growth: Automatically Advancing LLM Expertise in Technology Judgment
Siyang Wu
Honglin Bao
Nadav Kunievsky
James A. Evans
141
0
0
18 May 2025
LLM Context Conditioning and PWP Prompting for Multimodal Validation of Chemical Formulas
Evgeny Markhasin
94
1
0
18 May 2025
Measuring Information Distortion in Hierarchical Ultra long Novel Generation: The Optimal Expansion Ratio
Hanwen Shen
Ting Ying
80
0
0
18 May 2025
Retrospex: Language Agent Meets Offline Reinforcement Learning Critic
Yufei Xiang
Yiqun Shen
Yeqin Zhang
Cam-Tu Nguyen
OffRL, LLMAG, KELM, LRM
246
3
0
17 May 2025
SoftCoT++: Test-Time Scaling with Soft Chain-of-Thought Reasoning
Yige Xu
Xu Guo
Zhiwei Zeng
Chunyan Miao
BDL, LRM
153
1
0
16 May 2025
Disentangling Reasoning and Knowledge in Medical Large Language Models
Rahul Thapa
Qingyang Wu
Kevin Wu
Harrison Zhang
Angela Zhang
...
Joseph Boen
Shriya Reddy
Ben Athiwaratkun
Shuaiwen Leon Song
James Zou
ELM, AI4MH, LM&MA, LRM
131
2
0
16 May 2025
Rethinking the Role of Prompting Strategies in LLM Test-Time Scaling: A Perspective of Probability Theory
Yexiang Liu
Zekun Li
Zhi Fang
Nan Xu
Ran He
Tieniu Tan
LRM
80
0
0
16 May 2025
Critique-Guided Distillation: Improving Supervised Fine-tuning via Better Distillation
Berkcan Kapusuzoglu
Supriyo Chakraborty
Chia-Hsuan Lee
Sambit Sahu
139
0
0
16 May 2025
An agentic system with reinforcement-learned subsystem improvements for parsing form-like documents
Ayesha Amjad
Saurav Sthapit
Tahir Qasim Syed
74
0
0
16 May 2025
Mining Hidden Thoughts from Texts: Evaluating Continual Pretraining with Synthetic Data for LLM Reasoning
Yoichi Ishibashi
Taro Yano
Masafumi Oyamada
SyDa, LRM
116
2
0
15 May 2025
Convert Language Model into a Value-based Strategic Planner
Xiaoyu Wang
Yue Zhao
Qingqing Gu
Zhonglin Jiang
Xinyu Chen
Yong Chen
Luo Ji
LLMAG
72
0
0
11 May 2025
Text-to-CadQuery: A New Paradigm for CAD Generation with Scalable Large Model Capabilities
Haoyang Xie
Feng Ju
74
0
0
10 May 2025
G-FOCUS: Towards a Robust Method for Assessing UI Design Persuasiveness
Jaehyun Jeon
Janghan Yoon
Minsoo Kim
Sumin Shim
Yejin Choi
Hanbin Kim
Youngjae Yu
AAML
180
0
0
08 May 2025
Focus on the Likely: Test-time Instance-based Uncertainty Removal
Johannes Schneider
87
0
0
02 May 2025
When Reasoning Beats Scale: A 1.5B Reasoning Model Outranks 13B LLMs as Discriminator
Md Fahim Anjum
LRM
133
0
0
30 Apr 2025
ShorterBetter: Guiding Reasoning Models to Find Optimal Inference Length for Efficient Reasoning
Jingyang Yi
Jiazheng Wang
Sida Li
ReLM, OODD, LRM
443
8
0
30 Apr 2025
TTRL: Test-Time Reinforcement Learning
Yuxin Zuo
Kaiyan Zhang
Li Sheng
Xuekai Zhu
...
Youbang Sun
Zhiyuan Ma
Lifan Yuan
Ning Ding
Bowen Zhou
OffRL
444
31
0
22 Apr 2025