Prompt engineering and framework: implementation to increase code reliability based guideline for LLMs
- LLMAG

In this paper, we propose a novel prompting approach aimed at enhancing the ability of Large Language Models (LLMs) to generate accurate Python code. Specifically, we introduce a prompt template designed to improve the quality and correctness of generated code snippets, enabling them to pass tests and produce reliable results. Through experiments conducted on two state-of-the-art LLMs using the HumanEval dataset, we demonstrate that our approach outperforms widely studied zero-shot and Chain-of-Thought (CoT) methods in terms of the Pass@k metric. Furthermore, our method achieves these improvements with significantly reduced token usage compared to the CoT approach, making it both effective and resource-efficient, thereby lowering computational demands and reducing the environmental footprint of LLM use. These findings highlight the potential of tailored prompting strategies to optimize code generation performance, paving the way for broader applications in AI-driven programming tasks.
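For reference, Pass@k is the standard metric for HumanEval-style code generation, popularized as an unbiased estimator by Chen et al. (2021). The sketch below shows how that estimator is typically computed; it is an assumption that the paper uses this exact formulation, and the function name pass_at_k and the example inputs are ours for illustration, not the authors' evaluation code. Here n is the number of samples generated per problem, c the number that pass the unit tests, and k the sampling budget.

import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimator: probability that at least one of k
    samples drawn from n generations (of which c are correct) passes.
    Equivalent to 1 - C(n - c, k) / C(n, k)."""
    if n - c < k:
        # Fewer than k incorrect samples exist, so any draw of k
        # samples must include at least one correct solution.
        return 1.0
    # Numerically stable product form of 1 - C(n - c, k) / C(n, k).
    return 1.0 - math.prod(1.0 - k / i for i in range(n - c + 1, n + 1))

# Hypothetical usage: 200 generations per problem, 37 passing, budget k=10.
print(round(pass_at_k(200, 37, 10), 4))

In practice this per-problem estimate is averaged over all benchmark problems to produce the reported Pass@k score.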
@article{cruz2025_2506.10989,
  title={Prompt engineering and framework: implementation to increase code reliability based guideline for LLMs},
  author={Rogelio Cruz and Jonatan Contreras and Francisco Guerrero and Ezequiel Rodriguez and Carlos Valdez and Citlali Carrillo},
  journal={arXiv preprint arXiv:2506.10989},
  year={2025}
}