EvoGPT: Enhancing Test Suite Robustness via LLM-Based Generation and Genetic Optimization

18 May 2025

Main: 6 Pages

3 Figures

Bibliography: 2 Pages

5 Tables

Abstract

Large Language Models (LLMs) have recently emerged as promising tools for automated unit test generation. We introduce EvoGPT, a hybrid framework that integrates LLM-based test generation with evolutionary search techniques to create diverse, fault-revealing unit tests. Unit tests are first generated by sampling at varied temperatures to maximize behavioral and test suite diversity, then passed through a generation-repair loop and coverage-guided assertion enhancement. The resulting test suites are evolved using genetic algorithms, guided by a fitness function that prioritizes mutation score over traditional coverage metrics. This design emphasizes the primary objective of unit testing: fault detection. Evaluated on multiple open-source Java projects, EvoGPT achieves an average improvement of 10% in both code coverage and mutation score compared to LLMs and traditional search-based software testing baselines. These results demonstrate that combining LLM-driven diversity, targeted repair, and evolutionary optimization produces more effective and resilient test suites.
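
To make the evolutionary step more concrete, the following is a minimal Python sketch of a genetic search over test suites whose fitness weights mutation score above coverage. It is an illustration under stated assumptions, not the authors' implementation: the `UnitTest` record, the 0.8/0.2 weights, the truncation selection, the merge-based crossover, and all parameter defaults are hypothetical choices, and the abstract does not specify any of them. Each test is assumed to come with precomputed sets of killed mutants and covered lines.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class UnitTest:
    """Hypothetical record of what one generated test achieves."""
    name: str
    killed_mutants: frozenset  # ids of mutants this test kills
    covered_lines: frozenset   # line numbers this test executes


def fitness(suite, total_mutants, total_lines, w_mutation=0.8, w_coverage=0.2):
    """Weighted sum that favors mutation score; the weights are assumptions."""
    if not suite:
        return 0.0
    killed = set().union(*(t.killed_mutants for t in suite))
    covered = set().union(*(t.covered_lines for t in suite))
    mutation_score = len(killed) / total_mutants
    coverage = len(covered) / total_lines
    return w_mutation * mutation_score + w_coverage * coverage


def evolve(pool, total_mutants, total_lines,
           suite_size=3, pop_size=8, generations=20, seed=0):
    """Evolve fixed-size suites drawn from `pool`; returns the best suite found.

    Assumes len(pool) >= suite_size and pop_size >= 4.
    """
    rng = random.Random(seed)

    def score(suite):
        return fitness(suite, total_mutants, total_lines)

    population = [tuple(rng.sample(pool, suite_size)) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        parents = population[: pop_size // 2]  # keep the fitter half (elitism)
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            merged = list(dict.fromkeys(a + b))  # crossover: union of both parents
            child = rng.sample(merged, suite_size)
            # Mutation: occasionally swap one test for a pool member not yet in the suite.
            candidates = [t for t in pool if t not in child]
            if candidates and rng.random() < 0.3:
                child[rng.randrange(suite_size)] = rng.choice(candidates)
            children.append(tuple(child))
        population = parents + children
    return max(population, key=score)
```

Because the fitter half of each generation is carried over unchanged, the best fitness in the population never decreases between generations. In a real pipeline the per-test mutant and line sets would come from a mutation testing tool and a coverage tool run against the Java project under test, which this sketch leaves out.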


@article{broide2025_2505.12424,
  title={EvoGPT: Enhancing Test Suite Robustness via LLM-Based Generation and Genetic Optimization},
  author={Lior Broide and Roni Stern},
  journal={arXiv preprint arXiv:2505.12424},
  year={2025}
}
