
SAS-Prompt: Large Language Models as Numerical Optimizers for Robot Self-Improvement

Heni Ben Amor
Laura Graesser
Atil Iscen
David D'Ambrosio
Saminda Abeyruwan
Alex Bewley
Yifan Zhou
Kamalesh Kalirathinam
Swaroop Mishra
Pannag Sanketi
Abstract

We demonstrate the ability of large language models (LLMs) to perform iterative self-improvement of robot policies. An important insight of this paper is that LLMs have a built-in ability to perform (stochastic) numerical optimization and that this property can be leveraged for explainable robot policy search. Based on this insight, we introduce the SAS Prompt (Summarize, Analyze, Synthesize) -- a single prompt that enables iterative learning and adaptation of robot behavior by combining the LLM's ability to retrieve, reason, and optimize over previous robot traces in order to synthesize new, unseen behavior. Our approach can be regarded as an early example of a new family of explainable policy search methods that are entirely implemented within an LLM. We evaluate our approach both in simulation and on a real-robot table tennis task. Project website: this http URL
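To make the idea concrete, the sketch below shows a hypothetical SAS-style optimization loop in which a single prompt asks an LLM to Summarize, Analyze, and Synthesize over previous (parameters, reward) traces and propose new policy parameters. This is an illustrative assumption, not the authors' implementation: the prompt wording, the `query_llm` placeholder, and the toy `rollout` reward are all invented for the example.

```python
# Illustrative sketch only (not the paper's code): an SAS-style loop in which
# an LLM is prompted to Summarize, Analyze, and Synthesize over past robot
# traces to propose new policy parameters.
import json
import random

def query_llm(prompt: str) -> str:
    """Placeholder for a real LLM call. To keep the example runnable, it
    simply perturbs the best parameters found so far."""
    history = json.loads(prompt.split("TRACES:\n", 1)[1])
    best = max(history, key=lambda t: t["reward"])
    proposal = [p + random.gauss(0.0, 0.05) for p in best["params"]]
    return json.dumps(proposal)

def rollout(params):
    """Hypothetical policy evaluation; returns a scalar reward."""
    return -sum((p - 0.3) ** 2 for p in params)

SAS_PROMPT = (
    "You are optimizing a robot policy.\n"
    "1. SUMMARIZE the traces below (parameters and rewards).\n"
    "2. ANALYZE which parameter changes improved the reward and why.\n"
    "3. SYNTHESIZE a new parameter vector expected to improve the reward.\n"
    "Return only a JSON list of numbers.\n"
    "TRACES:\n"
)

# Seed the trace buffer with a few random policies and their rewards.
traces = [{"params": [random.uniform(-1, 1) for _ in range(3)], "reward": None}
          for _ in range(2)]
for t in traces:
    t["reward"] = rollout(t["params"])

# Iterative self-improvement: prompt, evaluate, append to the trace buffer.
for iteration in range(10):
    new_params = json.loads(query_llm(SAS_PROMPT + json.dumps(traces)))
    traces.append({"params": new_params, "reward": rollout(new_params)})

print("best reward:", max(t["reward"] for t in traces))
```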

@article{amor2025_2504.20459,
  title={SAS-Prompt: Large Language Models as Numerical Optimizers for Robot Self-Improvement},
  author={Heni Ben Amor and Laura Graesser and Atil Iscen and David D'Ambrosio and Saminda Abeyruwan and Alex Bewley and Yifan Zhou and Kamalesh Kalirathinam and Swaroop Mishra and Pannag Sanketi},
  journal={arXiv preprint arXiv:2504.20459},
  year={2025}
}