COAST: Enhancing the Code Debugging Ability of LLMs through Communicative Agent Based Data Synthesis

9 August 2024

Weiqing Yang

Hanbin Wang

Zhiyuan Liu

Abstract

Code debugging is a vital stage of software development, essential for ensuring the reliability and performance of Large Language Models (LLMs) in code generation tasks. Human debugging typically follows a multi-stage process that includes Bug Localization, Bug Identification, Code Repair, and Code Recognition. However, existing code debugging benchmarks predominantly focus on the Code Repair stage, which offers only a limited view of the debugging capabilities of LLMs. In this paper, we introduce DEBUGEVAL, a comprehensive benchmark that evaluates the debugging abilities of LLMs by emulating the multi-stage human debugging process. Evaluating models on DEBUGEVAL, we observe that 7B-scale models consistently underperform their larger counterparts, highlighting their limitations in comprehending code semantics. To address this limitation, we propose the COmmunicative Agent-based data SynThesis (COAST) framework, which employs a multi-agent system to generate high-quality training data for supervised fine-tuning (SFT). Experimental results demonstrate that COAST-generated data outperform human-curated and GPT-4-generated data, enabling 7B-scale LLMs to achieve debugging performance comparable to GPT-3.5.


@article{yang2025_2408.05006,
  title={COAST: Enhancing the Code Debugging Ability of LLMs through Communicative Agent Based Data Synthesis},
  author={Weiqing Yang and Hanbin Wang and Zhenghao Liu and Xinze Li and Yukun Yan and Shuo Wang and Yu Gu and Minghe Yu and Zhiyuan Liu and Ge Yu},
  journal={arXiv preprint arXiv:2408.05006},
  year={2025}
}
